Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions unicodetools/data/ucd/dev/DerivedAge.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# DerivedAge-18.0.0.txt
# Date: 2025-11-21, 17:54:09 GMT
# Date: 2025-11-21, 19:14:36 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -2126,6 +2126,8 @@ FDC8..FDCE ; 17.0 # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIG
058B..058C ; 18.0 # [2] MODIFIER LETTER ARMENIAN SMALL INI..MODIFIER LETTER ARMENIAN SMALL YI
05C8 ; 18.0 # HEBREW POINT SHEVA NA MUDGASH
0984 ; 18.0 # BENGALI SIGN COMBINING ANUSVARA ABOVE
1ADE..1ADF ; 18.0 # [2] COMBINING GRAVE-DOT..COMBINING DOT-ACUTE
1AEC..1AF0 ; 18.0 # [5] COMBINING CARON-ACUTE..COMBINING DOUBLE COMMA ABOVE
20C2..20C3 ; 18.0 # [2] RUFIYAA SIGN..UAE DIRHAM SIGN
10ED9..10EEE ; 18.0 # [22] ARABIC CROWN LETTER BEH..ARABIC CROWN LETTER YEH
10EF9 ; 18.0 # ARABIC MARK CROWN
Expand All @@ -2144,6 +2146,6 @@ FDC8..FDCE ; 17.0 # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIG
2B81E ; 18.0 # CJK UNIFIED IDEOGRAPH-2B81E
3D000..3FC3F ; 18.0 # [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

# Total code points: 11775
# Total code points: 11782

# EOF
27 changes: 11 additions & 16 deletions unicodetools/data/ucd/dev/DerivedCoreProperties.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# DerivedCoreProperties-18.0.0.txt
# Date: 2025-11-21, 17:54:30 GMT
# Date: 2025-11-21, 19:14:57 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -3255,8 +3255,7 @@ FF41..FF5A ; Cased # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
1AA7 ; Case_Ignorable # Lm TAI THAM SIGN MAI YAMOK
1AB0..1ABD ; Case_Ignorable # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; Case_Ignorable # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; Case_Ignorable # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; Case_Ignorable # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1ABF..1AF0 ; Case_Ignorable # Mn [50] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOUBLE COMMA ABOVE
1B00..1B03 ; Case_Ignorable # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B34 ; Case_Ignorable # Mn BALINESE SIGN REREKAN
1B36..1B3A ; Case_Ignorable # Mn [5] BALINESE VOWEL SIGN ULU..BALINESE VOWEL SIGN RA REPA
Expand Down Expand Up @@ -3581,7 +3580,7 @@ E0001 ; Case_Ignorable # Cf LANGUAGE TAG
E0020..E007F ; Case_Ignorable # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; Case_Ignorable # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 2808
# Total code points: 2815

# ================================================

Expand Down Expand Up @@ -7603,8 +7602,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
1A90..1A99 ; ID_Continue # Nd [10] TAI THAM THAM DIGIT ZERO..TAI THAM THAM DIGIT NINE
1AA7 ; ID_Continue # Lm TAI THAM SIGN MAI YAMOK
1AB0..1ABD ; ID_Continue # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABF..1ADD ; ID_Continue # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; ID_Continue # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1ABF..1AF0 ; ID_Continue # Mn [50] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOUBLE COMMA ABOVE
1B00..1B03 ; ID_Continue # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04 ; ID_Continue # Mc BALINESE SIGN BISAH
1B05..1B33 ; ID_Continue # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
Expand Down Expand Up @@ -8553,7 +8551,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
3D000..3FC3F ; ID_Continue # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F
E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 160997
# Total code points: 161004

# ================================================

Expand Down Expand Up @@ -9847,8 +9845,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
1A90..1A99 ; XID_Continue # Nd [10] TAI THAM THAM DIGIT ZERO..TAI THAM THAM DIGIT NINE
1AA7 ; XID_Continue # Lm TAI THAM SIGN MAI YAMOK
1AB0..1ABD ; XID_Continue # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABF..1ADD ; XID_Continue # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; XID_Continue # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1ABF..1AF0 ; XID_Continue # Mn [50] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOUBLE COMMA ABOVE
1B00..1B03 ; XID_Continue # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04 ; XID_Continue # Mc BALINESE SIGN BISAH
1B05..1B33 ; XID_Continue # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
Expand Down Expand Up @@ -10802,7 +10799,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
3D000..3FC3F ; XID_Continue # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F
E0100..E01EF ; XID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 160978
# Total code points: 160985

# ================================================

Expand Down Expand Up @@ -11024,8 +11021,7 @@ E01F0..E0FFF ; Default_Ignorable_Code_Point # Cn [3600] <reserved-E01F0>..<rese
1A7F ; Grapheme_Extend # Mn TAI THAM COMBINING CRYPTOGRAMMIC DOT
1AB0..1ABD ; Grapheme_Extend # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; Grapheme_Extend # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; Grapheme_Extend # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; Grapheme_Extend # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1ABF..1AF0 ; Grapheme_Extend # Mn [50] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOUBLE COMMA ABOVE
1B00..1B03 ; Grapheme_Extend # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B34 ; Grapheme_Extend # Mn BALINESE SIGN REREKAN
1B35 ; Grapheme_Extend # Mc BALINESE VOWEL SIGN TEDUNG
Expand Down Expand Up @@ -11285,7 +11281,7 @@ FF9E..FF9F ; Grapheme_Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK.
E0020..E007F ; Grapheme_Extend # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; Grapheme_Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 2237
# Total code points: 2244

# ================================================

Expand Down Expand Up @@ -13465,8 +13461,7 @@ ABC0..ABDA ; InCB; Consonant # Lo [27] MEETEI MAYEK LETTER KOK..MEETEI MAYEK
1A7F ; InCB; Extend # Mn TAI THAM COMBINING CRYPTOGRAMMIC DOT
1AB0..1ABD ; InCB; Extend # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; InCB; Extend # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; InCB; Extend # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; InCB; Extend # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1ABF..1AF0 ; InCB; Extend # Mn [50] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOUBLE COMMA ABOVE
1B00..1B03 ; InCB; Extend # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B34 ; InCB; Extend # Mn BALINESE SIGN REREKAN
1B35 ; InCB; Extend # Mc BALINESE VOWEL SIGN TEDUNG
Expand Down Expand Up @@ -13721,6 +13716,6 @@ FF9E..FF9F ; InCB; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HA
E0020..E007F ; InCB; Extend # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; InCB; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 2222
# Total code points: 2229

# EOF
5 changes: 2 additions & 3 deletions unicodetools/data/ucd/dev/EastAsianWidth.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# EastAsianWidth-18.0.0.txt
# Date: 2025-11-21, 17:54:37 GMT
# Date: 2025-11-21, 19:15:02 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -808,8 +808,7 @@
1AA8..1AAD ; N # Po [6] TAI THAM SIGN KAAN..TAI THAM SIGN CAANG
1AB0..1ABD ; N # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; N # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; N # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; N # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1ABF..1AF0 ; N # Mn [50] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOUBLE COMMA ABOVE
1B00..1B03 ; N # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04 ; N # Mc BALINESE SIGN BISAH
1B05..1B33 ; N # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
Expand Down
6 changes: 3 additions & 3 deletions unicodetools/data/ucd/dev/LineBreak.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# LineBreak-18.0.0.txt
# Date: 2025-11-21, 17:54:40 GMT
# Date: 2025-11-21, 19:15:04 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -776,9 +776,9 @@
1AA8..1AAD ; SA # Po [6] TAI THAM SIGN KAAN..TAI THAM SIGN CAANG
1AB0..1ABD ; CM # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; CM # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; CM # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEA ; CM # Mn [11] COMBINING LEFT TACK ABOVE..COMBINING UPWARDS ARROW ABOVE
1ABF..1AEA ; CM # Mn [44] COMBINING LATIN SMALL LETTER W BELOW..COMBINING UPWARDS ARROW ABOVE
1AEB ; GL # Mn COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1AEC..1AF0 ; CM # Mn [5] COMBINING CARON-ACUTE..COMBINING DOUBLE COMMA ABOVE
Comment on lines +779 to +781
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When they span two base characters, they are GL. When they just go over one base character, they are CM. The naming intersection is a little unfortunate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, and looking at L2/24-232 the two new double accents below go under just one base character. Thanks!

1B00..1B03 ; CM # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04 ; CM # Mc BALINESE SIGN BISAH
1B05..1B33 ; AK # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
Expand Down
Loading
Loading