Entity position information goes wrong when emojis appear before entities

Thank you for developing/maintaining this tool 🙏 

I've encountered seemingly wrong entity span information when a text contains emojis.

### How to reproduce the issue

1. First, in a Jupyter notebook, run:

   ```python
   import jupyterannotate

   TEXTS = [
       "Hi John!",
       "👍 John!",
       "👍🏿 John!",
   ]

   annotation_widget = jupyterannotate.AnnotateWidget(
       docs=TEXTS,
       labels=["NAME"]
   )
   annotation_widget
   ```

1. Annotate "John" as NAME in all 3 texts.
1. The spans are then set like this:

   ```python
   spans = annotation_widget.spans
   spans
   
   # [[{'start': 3, 'end': 7, 'text': 'John', 'label': 'NAME'}],
   #  [{'start': 3, 'end': 7, 'text': 'John', 'label': 'NAME'}],
   #  [{'start': 5, 'end': 9, 'text': 'John', 'label': 'NAME'}]]
   ```

1. I expect slicing each text with its span positions to give "John" in every case, but it does not when emojis appear before "John".

   ```python
   for i in range(len(TEXTS)):
       print(f'{i+1}. "{TEXTS[i][spans[i][0]["start"] : spans[i][0]["end"]]}"')
   
   # 1. "John"
   # 2. "ohn!"
   # 3. "hn!"
   ```

### Expected behaviour

It should print the following (the same for all three texts):

```
1. "John"
2. "John"
3. "John"
```

### Possible cause

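It seems to be related to the difference in how Python and JavaScript count string length: JavaScript counts UTF-16 code units, whereas Python counts Unicode code points, so each emoji outside the Basic Multilingual Plane shifts the offsets by one. cf. [JavaScript vs Python emoji length](https://stackoverflow.com/questions/69584227/javascript-vs-python-emoji-length)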
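
To illustrate the suspected mismatch, here is a minimal sketch comparing Python's `len` with the number of UTF-16 code units, which is what JavaScript's `String.length` reports. The helper name `utf16_length` is made up for this example and is not part of jupyterannotate.

```python
def utf16_length(text: str) -> int:
    """Count UTF-16 code units (hypothetical helper, mirrors JavaScript's String.length)."""
    # Each UTF-16 code unit takes 2 bytes in the little-endian encoding (no BOM).
    return len(text.encode("utf-16-le")) // 2


TEXTS = [
    "Hi John!",
    "👍 John!",
    "👍🏿 John!",
]

for text in TEXTS:
    # Python length (code points) vs. JavaScript-style length (UTF-16 code units)
    print(len(text), utf16_length(text))

# 8 8
# 7 8
# 8 10
```

The gaps of 1 and 2 match the shifts seen in the reported spans for the second and third texts.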

Since this library runs in Jupyter notebooks (and the JavaScript used behind the scenes is an implementation detail), it would be great if span positions were counted the way Python counts them, so that further processing can be done in the same notebook without any issues.
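
As a possible workaround on the notebook side, assuming the reported offsets really are UTF-16 code unit positions, something like the sketch below converts them to Python string indices. `utf16_to_python_offset` and `fix_span` are hypothetical names for illustration, not existing jupyterannotate functions.

```python
def utf16_to_python_offset(text: str, utf16_offset: int) -> int:
    """Map an offset in UTF-16 code units to a Python str index (hypothetical helper)."""
    units = 0
    for index, char in enumerate(text):
        if units >= utf16_offset:
            return index
        # Characters above U+FFFF occupy two UTF-16 code units (a surrogate pair).
        units += 2 if ord(char) > 0xFFFF else 1
    return len(text)


def fix_span(text: str, span: dict) -> dict:
    """Return a copy of the span with start/end converted to Python indices."""
    return {
        **span,
        "start": utf16_to_python_offset(text, span["start"]),
        "end": utf16_to_python_offset(text, span["end"]),
    }


TEXTS = ["Hi John!", "👍 John!", "👍🏿 John!"]
# Spans exactly as reported by the widget in the steps above
spans = [
    [{"start": 3, "end": 7, "text": "John", "label": "NAME"}],
    [{"start": 3, "end": 7, "text": "John", "label": "NAME"}],
    [{"start": 5, "end": 9, "text": "John", "label": "NAME"}],
]

for text, doc_spans in zip(TEXTS, spans):
    fixed = fix_span(text, doc_spans[0])
    print(text[fixed["start"] : fixed["end"]])  # prints "John" for all three texts
```

This only patches the symptom after the fact; having the widget return Python-compatible offsets directly would still be preferable.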
