@shinaoka
Contributor

@shinaoka shinaoka commented Dec 14, 2025

Summary

This PR adds a user-facing setting to choose the tokenizer mode for local message search. N-gram tokenization is essential for languages without clear word boundaries such as Japanese, Chinese, and Korean (CJK languages).
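To illustrate why word-based tokenization falls short here, the sketch below compares whitespace splitting with overlapping character bigrams. It is a minimal illustration only, not Seshat's actual tokenizer; the function names `whitespaceTokens` and `ngramTokens` are hypothetical.

```typescript
// Illustrative only: this is NOT Seshat's tokenizer, just the general idea.

/** Split on whitespace, roughly what a word-based tokenizer relies on. */
function whitespaceTokens(text: string): string[] {
    return text.split(/\s+/).filter((t) => t.length > 0);
}

/** Produce overlapping character n-grams, counted by Unicode code point. */
function ngramTokens(text: string, n = 2): string[] {
    // Drop whitespace, then split into code points so surrogate pairs stay whole.
    const chars = Array.from(text.replace(/\s+/g, ""));
    if (chars.length === 0) return [];
    if (chars.length <= n) return [chars.join("")];

    const grams: string[] = [];
    for (let i = 0; i + n <= chars.length; i++) {
        grams.push(chars.slice(i, i + n).join(""));
    }
    return grams;
}

// A Japanese sentence has no spaces, so whitespace splitting yields one token,
// while bigrams yield searchable fragments such as "東京".
const sentence = "東京都に住む";
console.log(whitespaceTokens(sentence));
console.log(ngramTokens(sentence));
```

With whitespace splitting, a query for 東京 cannot match the single indexed token unless the index supports substring search; with bigrams, 東京 is itself an indexed term.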

Changes

  • Add tokenizer mode setting (Standard/Japanese) in Message Search preferences
  • Pass tokenizer mode to Seshat via element-desktop IPC
  • Add confirmation dialog when changing tokenizer mode (requires reindex)
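The flow described by these changes could look roughly like the sketch below. Every name in it (`TokenizerMode`, `TokenizerModeDeps`, `changeTokenizerMode`) is hypothetical and not taken from this PR's diff; the dialog, settings store, and Seshat IPC call are abstracted behind an interface so the example stays self-contained.

```typescript
// Hypothetical sketch: none of these names come from the PR's diff.
type TokenizerMode = "standard" | "japanese";

interface TokenizerModeDeps {
    /** Read the currently stored mode. */
    getMode(): TokenizerMode;
    /** Persist the new mode (assumed to be a user-level setting). */
    setMode(mode: TokenizerMode): Promise<void>;
    /** Show the confirmation dialog; resolves true if the user accepts. */
    confirmReindex(): Promise<boolean>;
    /** Ask the desktop side to rebuild the index with the given mode. */
    reindex(mode: TokenizerMode): Promise<void>;
}

/** Returns true if the mode was changed and a reindex was started. */
async function changeTokenizerMode(next: TokenizerMode, deps: TokenizerModeDeps): Promise<boolean> {
    if (deps.getMode() === next) return false; // nothing to change
    if (!(await deps.confirmReindex())) return false; // user cancelled
    await deps.setMode(next);
    await deps.reindex(next);
    return true;
}
```

Asking for confirmation before persisting the setting keeps the stored mode and the on-disk index consistent if the user cancels.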

Dependencies

⚠️ This PR depends on Seshat PR #150 (matrix-org/seshat#150) and an element-desktop PR.

Both upstream PRs must be merged before this PR can be merged.

Merge Order

1. Seshat PR #150  →  npm release (new matrix-seshat version)
2. element-desktop PR  →  update matrix-seshat dependency
3. element-web (this PR)  →  UI for tokenizer mode selection

Checklist

@shinaoka shinaoka requested a review from a team as a code owner December 14, 2025 23:13
@github-actions github-actions bot added the Z-Community-PR label (Issue is solved by a community member's PR) Dec 14, 2025
- Add tokenizer mode setting (Standard/Japanese) in preferences
- Pass tokenizer mode to Seshat when indexing events
- Add confirmation dialog when changing tokenizer mode (requires reindex)
@t3chguy
Member

t3chguy commented Dec 15, 2025

@shinaoka seems like you have lint & insufficient test coverage issues

@shinaoka
Contributor Author

I have managed to fix the CI errors!

@t3chguy
Member

t3chguy commented Jan 2, 2026

Blocked on matrix-org/seshat#150

@dbkr dbkr added the X-Needs-Product label (More input needed from the Product team) Jan 8, 2026

Labels

  • T-Enhancement
  • X-Blocked
  • X-Needs-Product (More input needed from the Product team)
  • Z-Community-PR (Issue is solved by a community member's PR)

3 participants