feat: improve Unicode character handling in rectangle operations #2146

Open. Wants to merge 1 commit into base: main.

devin-ai-integration[bot]

Unicode Text Handling Improvements for Rectangle Operations

This PR improves the handling of Unicode characters in rectangle operations, particularly focusing on CJK characters, emoji, and combining marks.

Changes

  • Added comprehensive Unicode boundary detection for:
    • CJK characters (including full-width characters)
    • Emoji sequences
    • Combining marks and surrogate pairs
  • Preserved selection direction for both horizontal and vertical selections
  • Refactored character position adjustment logic for better clarity
  • Added detailed test cases for various Unicode scenarios
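One way to read "preserved selection direction" is sketched below. The PR text does not show the actual adjustment logic, so the names `ColumnRange` and `snapRange`, and the idea of passing the boundary check in as a callback, are assumptions made for illustration. The sketch widens a column range outward to the nearest valid boundaries and keeps the anchor on whichever side it started.

```typescript
// Hypothetical types and names; not taken from the PR's source.
interface ColumnRange {
  anchor: number; // where the selection started
  active: number; // where the cursor currently is
}

/**
 * Widen a column range so both ends sit on valid character boundaries,
 * keeping the original direction (anchor before or after active).
 */
function snapRange(
  range: ColumnRange,
  lineLength: number,
  isBoundary: (column: number) => boolean
): ColumnRange {
  const snap = (column: number, step: 1 | -1): number => {
    // Clamp into the line, then walk until a boundary is reached.
    let c = Math.max(0, Math.min(column, lineLength));
    while (c > 0 && c < lineLength && !isBoundary(c)) {
      c += step;
    }
    return c;
  };

  const reversed = range.active < range.anchor;
  const start = snap(Math.min(range.anchor, range.active), -1); // move left
  const end = snap(Math.max(range.anchor, range.active), 1); // move right

  return reversed
    ? { anchor: end, active: start }
    : { anchor: start, active: end };
}
```

Widening rather than shrinking is a design choice in this sketch: it avoids ever selecting half of a character, at the cost of sometimes selecting one character more than the raw columns covered.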

Testing

  • Added new test file src/test/suite/commands/rectangle-unicode.test.ts
  • Test cases cover:
    • Japanese text boundaries
    • Greek text with combining marks
    • Emoji sequences
    • Mixed CJK (Korean/Chinese) text
  • All ESLint and Prettier checks pass

Implementation Details

  • Added helper functions for Unicode character detection:
    • isValidCharacterBoundary: Checks if a position is a valid Unicode character boundary
    • isSurrogatePair: Detects surrogate pairs
    • isCombiningMark: Identifies combining marks
    • isFullWidth: Detects CJK and other full-width characters
    • isEmoji: Identifies emoji characters
  • Improved selection logic to respect character boundaries and maintain selection direction
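The helper names above come from the PR description, but their bodies and signatures are not shown here. The following is a minimal sketch of how such helpers might look, assuming they work on UTF-16 string indices and numeric code points. The code point ranges are coarse approximations chosen for illustration, not a complete Unicode classification, and `codePointAtIndex` and `adjustToBoundary` are hypothetical additions.

```typescript
// Assumed signatures throughout; the PR only lists the helper names.

/** True if the UTF-16 code units at index and index + 1 form a surrogate pair. */
function isSurrogatePair(text: string, index: number): boolean {
  const hi = text.charCodeAt(index);
  const lo = text.charCodeAt(index + 1);
  return hi >= 0xd800 && hi <= 0xdbff && lo >= 0xdc00 && lo <= 0xdfff;
}

/** Hypothetical helper: decode the code point starting at a UTF-16 index. */
function codePointAtIndex(text: string, index: number): number {
  const hi = text.charCodeAt(index);
  if (isSurrogatePair(text, index)) {
    const lo = text.charCodeAt(index + 1);
    return (hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000;
  }
  return hi;
}

/** Covers the main combining-mark blocks only. */
function isCombiningMark(cp: number): boolean {
  return (
    (cp >= 0x0300 && cp <= 0x036f) || // Combining Diacritical Marks
    (cp >= 0x1ab0 && cp <= 0x1aff) || // ... Extended
    (cp >= 0x1dc0 && cp <= 0x1dff) || // ... Supplement
    (cp >= 0x20d0 && cp <= 0x20ff) || // ... for Symbols
    (cp >= 0xfe20 && cp <= 0xfe2f) // Combining Half Marks
  );
}

/** Approximate: common CJK, kana, Hangul and full-width form ranges. */
function isFullWidth(cp: number): boolean {
  return (
    (cp >= 0x1100 && cp <= 0x115f) || // Hangul Jamo (leading consonants)
    (cp >= 0x2e80 && cp <= 0x303e) || // CJK radicals, symbols, punctuation
    (cp >= 0x3041 && cp <= 0x33ff) || // Hiragana, Katakana, CJK compatibility
    (cp >= 0x3400 && cp <= 0x4dbf) || // CJK Extension A
    (cp >= 0x4e00 && cp <= 0x9fff) || // CJK Unified Ideographs
    (cp >= 0xac00 && cp <= 0xd7a3) || // Hangul Syllables
    (cp >= 0xf900 && cp <= 0xfaff) || // CJK Compatibility Ideographs
    (cp >= 0xff01 && cp <= 0xff60) || // Fullwidth ASCII variants
    (cp >= 0xffe0 && cp <= 0xffe6) // Fullwidth signs
  );
}

/** Very coarse: two blocks where most emoji live. */
function isEmoji(cp: number): boolean {
  return (cp >= 0x1f300 && cp <= 0x1faff) || (cp >= 0x2600 && cp <= 0x27bf);
}

/** True if a selection edge may sit at this UTF-16 index. */
function isValidCharacterBoundary(text: string, index: number): boolean {
  if (index <= 0 || index >= text.length) return true;
  // Never split a surrogate pair.
  if (isSurrogatePair(text, index - 1)) return false;
  const cp = codePointAtIndex(text, index);
  // Combining marks, ZWJ and variation selectors attach to what precedes them.
  if (isCombiningMark(cp)) return false;
  if (cp === 0x200d || (cp >= 0xfe00 && cp <= 0xfe0f)) return false;
  // A code point right after a ZWJ continues the same emoji sequence.
  if (text.charCodeAt(index - 1) === 0x200d) return false;
  return true;
}

/** Hypothetical helper: move an index to the nearest valid boundary. */
function adjustToBoundary(text: string, index: number, forward: boolean): number {
  let i = Math.max(0, Math.min(index, text.length));
  while (!isValidCharacterBoundary(text, i)) {
    i += forward ? 1 : -1;
  }
  return i;
}
```

For example, in the string `"a\uD83D\uDE00b"` (the letter a, a single emoji stored as two code units, then b), index 2 falls between the two halves of the emoji, so `adjustToBoundary` moves it to 1 when going backward and to 3 when going forward. A production version would more likely rely on grapheme cluster segmentation than on hand-written ranges.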

Link to Devin run: https://app.devin.ai/sessions/d2255a9c45554e29943eb580739184c6

- Add comprehensive Unicode boundary detection for CJK, emoji, and combining marks
- Preserve selection direction for both horizontal and vertical selections
- Refactor character position adjustment logic for better clarity
- Add detailed test cases for various Unicode scenarios

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add "(aside)" to your comment to have me ignore it.
  • Look at CI failures and help fix them

⚙️ Control Options:

  • Disable automatic comment and CI monitoring
