Commit 07fa76e

Merge pull request #27 from SamEdwardes/uv-and-python-3-12
uv, python 3.12, only ._.blob access, remove textblob-de
2 parents 9a87f3d + 26811ec commit 07fa76e

21 files changed

+2612
-2334
lines changed

.github/workflows/pytest.yml

Lines changed: 6 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -7,37 +7,27 @@ on:
77
pull_request:
88
branches: [ main ]
99
push:
10-
branches:
10+
branches:
1111
- main
12+
workflow_dispatch:
1213

1314
jobs:
1415
build:
1516

1617
runs-on: ubuntu-latest
1718
strategy:
1819
matrix:
19-
python-version: ["3.7", "3.8", "3.9"]
20+
python-version: ["3.9", "3.10", "3.11", "3.12"]
2021

2122
steps:
2223
- uses: actions/checkout@v2
2324
- name: Set up Python ${{ matrix.python-version }}
2425
uses: actions/setup-python@v2
2526
with:
2627
python-version: ${{ matrix.python-version }}
27-
- name: Install python dependencies
28+
- name: Setup uv
2829
run: |
29-
curl -sSL https://install.python-poetry.org | python3 -
30-
poetry export --without-hashes --output requirements.txt
31-
python -m pip install --upgrade pip
32-
pip install wheel
33-
pip install -r requirements.txt
34-
python -m textblob.download_corpora
35-
python -m spacy download en_core_web_sm
36-
pip install textblob-de
37-
python -m spacy download de_core_news_sm
38-
pip install textblob-fr
39-
python -m spacy download fr_core_news_sm
40-
pip install pytest
30+
curl -LsSf https://astral.sh/uv/install.sh | sh
4131
- name: Test with pytest
4232
run: |
43-
pytest
33+
uv run --python ${{ matrix.python-version }} --all-extras pytest
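
To reproduce the CI matrix locally, the test step above can be expanded into one command per Python version. The following is a minimal sketch and not part of the commit: `PYTHON_VERSIONS` and `matrix_commands` are hypothetical names, and the command line itself is copied from the workflow step.

```python
import shlex

# Versions taken from the workflow matrix above.
PYTHON_VERSIONS = ["3.9", "3.10", "3.11", "3.12"]


def matrix_commands(versions=PYTHON_VERSIONS):
    """Build one `uv run ... pytest` argument list per Python version."""
    return [
        ["uv", "run", "--python", version, "--all-extras", "pytest"]
        for version in versions
    ]


# Print the commands so they can be copied into a shell or a task runner.
for command in matrix_commands():
    print(shlex.join(command))
```

Each argument list can also be passed to `subprocess.run(command, check=True)` to run the tests directly, assuming `uv` is on the `PATH`.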

CONTRIBUTING.md

Lines changed: 10 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -4,13 +4,12 @@
44

55
## Development environment
66

7-
### poetry
7+
### uv
88

9-
`poetry` is used to manage python dependencies. See the docs on how to install python [https://python-poetry.org/](https://python-poetry.org/). To activate the poetry virtual environment run the following commands:
9+
`uv` is used to manage python dependencies. Run the following to install `uv`:
1010

1111
```bash
12-
poetry install
13-
poetry shell
12+
curl -LsSf https://astral.sh/uv/install.sh | sh
1413
```
1514

1615
### just
@@ -19,24 +18,26 @@ poetry shell
1918

2019
## Code formatting
2120

22-
Please use the [black](https://black.readthedocs.io/en/stable/) for formatting code before submitting a PR.
23-
2421
```bash
25-
black spacytextblob
22+
just format
2623
```
2724

2825
## Testing
2926

3027
Please validate that all tests pass before submitting a PR by running:
3128

3229
```bash
33-
pytest
30+
# Test against the latest supported version of Python
31+
just test
32+
33+
# Test against all supported versions of Python
34+
just test-matrix
3435
```
3536

3637
## Docs
3738

3839
To build the docs and visually inspect the docs please run:
3940

4041
```bash
41-
just docs
42+
just preview-docs
4243
```

docs/api.md

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -34,7 +34,6 @@ When adding *spacytextblob* to your spaCy pipeline you can optionally pass addit
3434

3535
| Name | Type | Description |
3636
|------|------|-------------|
37-
| `blob_only` | `bool` | If True, *spacytextblob* will only expose `._.blob` and not attempt to expose `._.polarity`, `._.subjectivity`, or `._.assessments`. This should always be set to True when using TextBlob extensions. By default `False`. |
3837
| `custom_blob` | `Dict[str, str]` | The `"custom_blob"` key should be assigned to a dictionary that tells spaCy what function to replace `textblob.TextBlob` with. In this case, we want to replace it with `TextBlobDE`. The key of the dictionary is `"@misc"`. This tells spaCy to look into the misc section of the spaCy register. The value should be the string name of a function that you have registered with spaCy. See the [TextBlob extensions](tutorial/textblob_extensions.md) section for more details. |
3938

4039

@@ -45,7 +44,6 @@ from spacytextblob.spacytextblob import SpacyTextBlob
4544
nlp = spacy.load("de_core_news_sm")
4645

4746
nlp.add_pipe( "spacytextblob", config={
48-
"blob_only": ..., # bool
4947
"custom_blob": ... # Dict[str, str]
5048
})
5149
```
@@ -61,5 +59,5 @@ Using *spacytextblob* without an extension:
6159
Using *spacytextblob* with an extension:
6260

6361
```python
64-
{! docs/static/reference_code/textblob_de_example.py !}
62+
{! docs/static/reference_code/textblob_fr_example.py !}
6563
```

docs/changelog.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,17 @@
11
# Changelog
22

3+
## 5.0.0 (2024-10-12)
4+
5+
**Breaking changes**
6+
7+
- Supported Python versions are now 3.9 through 3.12.
8+
- Removed support for the `textblob-de` extension. See [#25](https://github.com/SamEdwardes/spacytextblob/issues/25) for more details.
9+
- Removed support for accessing `._.polarity`, `._.sentiment`, `._.subjectivity`, and `._.assessments`. Only the `._.blob` attribute is now exposed, and all other TextBlob attributes should be accessed through it, for example `._.blob.polarity`, `._.blob.sentiment`, `._.blob.subjectivity`, and `._.blob.sentiment_assessments.assessments`. This simplifies the code base and makes it easier to maintain. As a consequence, the config option `{"blob_only": bool}` was removed.
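
Code that relied on the removed attributes can be migrated by reading the same values through `._.blob`. The sketch below is one possible way to do that and is not part of the library: `LEGACY_TO_BLOB_PATH` and `legacy_attr` are hypothetical names, the attribute paths come from the entry above, and a `SimpleNamespace` stands in for `doc._.blob` so the snippet runs without spaCy installed.

```python
from operator import attrgetter
from types import SimpleNamespace

# Removed extension attribute -> path of the same value under `._.blob`.
LEGACY_TO_BLOB_PATH = {
    "polarity": "polarity",
    "sentiment": "sentiment",
    "subjectivity": "subjectivity",
    "assessments": "sentiment_assessments.assessments",
}


def legacy_attr(blob, name):
    """Return the value that the removed `._.<name>` attribute exposed."""
    return attrgetter(LEGACY_TO_BLOB_PATH[name])(blob)


# Stand-in for `doc._.blob`; the numbers are made up for illustration.
fake_blob = SimpleNamespace(
    polarity=-0.125,
    subjectivity=0.9,
    sentiment=(-0.125, 0.9),
    sentiment_assessments=SimpleNamespace(
        assessments=[(["happy"], 0.8, 1.0, None)]
    ),
)

print(legacy_attr(fake_blob, "polarity"))
print(legacy_attr(fake_blob, "assessments"))
```

With a real pipeline the call would be `legacy_attr(doc._.blob, "polarity")`, although writing `doc._.blob.polarity` directly is usually simpler.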
10+
11+
**Other changes**
12+
13+
- Use `uv` instead of `poetry`.
14+
315
## 4.0.0 (2022-02-19)
416

517
- New custom attribute `doc._.blob`, `span._.blob`, `token._.blob`.

docs/static/reference_code/spacytextblob_example.py

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,8 @@
11
import spacy
2-
from spacytextblob.spacytextblob import SpacyTextBlob
32

4-
nlp = spacy.load('en_core_web_sm')
3+
from spacytextblob.spacytextblob import SpacyTextBlob # noqa: F401
4+
5+
nlp = spacy.load("en_core_web_sm")
56
text = "I had a really horrible day. It was the worst day ever! But every now and then I have a really good day that makes me happy."
67
nlp.add_pipe("spacytextblob")
78
doc = nlp(text)
@@ -16,4 +17,4 @@
1617
# [(['really', 'horrible'], -1.0, 1.0, None), (['worst', '!'], -1.0, 1.0, None), (['really', 'good'], 0.7, 0.6000000000000001, None), (['happy'], 0.8, 1.0, None)]
1718

1819
print(doc._.blob.ngrams())
19-
# [WordList(['I', 'had', 'a']), WordList(['had', 'a', 'really']), WordList(['a', 'really', 'horrible']), WordList(['really', 'horrible', 'day']), WordList(['horrible', 'day', 'It']), WordList(['day', 'It', 'was']), WordList(['It', 'was', 'the']), WordList(['was', 'the', 'worst']), WordList(['the', 'worst', 'day']), WordList(['worst', 'day', 'ever']), WordList(['day', 'ever', 'But']), WordList(['ever', 'But', 'every']), WordList(['But', 'every', 'now']), WordList(['every', 'now', 'and']), WordList(['now', 'and', 'then']), WordList(['and', 'then', 'I']), WordList(['then', 'I', 'have']), WordList(['I', 'have', 'a']), WordList(['have', 'a', 'really']), WordList(['a', 'really', 'good']), WordList(['really', 'good', 'day']), WordList(['good', 'day', 'that']), WordList(['day', 'that', 'makes']), WordList(['that', 'makes', 'me']), WordList(['makes', 'me', 'happy'])]
20+
# [WordList(['I', 'had', 'a']), WordList(['had', 'a', 'really']), WordList(['a', 'really', 'horrible']), WordList(['really', 'horrible', 'day']), WordList(['horrible', 'day', 'It']), WordList(['day', 'It', 'was']), WordList(['It', 'was', 'the']), WordList(['was', 'the', 'worst']), WordList(['the', 'worst', 'day']), WordList(['worst', 'day', 'ever']), WordList(['day', 'ever', 'But']), WordList(['ever', 'But', 'every']), WordList(['But', 'every', 'now']), WordList(['every', 'now', 'and']), WordList(['now', 'and', 'then']), WordList(['and', 'then', 'I']), WordList(['then', 'I', 'have']), WordList(['I', 'have', 'a']), WordList(['have', 'a', 'really']), WordList(['a', 'really', 'good']), WordList(['really', 'good', 'day']), WordList(['good', 'day', 'that']), WordList(['day', 'that', 'makes']), WordList(['that', 'makes', 'me']), WordList(['makes', 'me', 'happy'])]

docs/static/reference_code/textblob_de_example.py

Lines changed: 8 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,22 +1,21 @@
11
import spacy
2-
from spacytextblob.spacytextblob import SpacyTextBlob
32
from textblob_de import TextBlobDE
43

4+
from spacytextblob.spacytextblob import SpacyTextBlob # noqa: F401
55

6-
text = '''
7-
Heute ist der 3. Mai 2014 und Dr. Meier feiert seinen 43. Geburtstag. Ich muss
8-
unbedingt daran denken, Mehl, usw. für einen Kuchen einzukaufen. Aber leider
6+
text = """
7+
Heute ist der 3. Mai 2014 und Dr. Meier feiert seinen 43. Geburtstag. Ich muss
8+
unbedingt daran denken, Mehl, usw. für einen Kuchen einzukaufen. Aber leider
99
habe ich nur noch EUR 3.50 in meiner Brieftasche.
10-
'''
10+
"""
11+
1112

1213
@spacy.registry.misc("spacytextblob.de_blob")
1314
def create_de_blob():
1415
return TextBlobDE
1516

16-
config = {
17-
"blob_only": True,
18-
"custom_blob": {"@misc": "spacytextblob.de_blob"}
19-
}
17+
18+
config = {"custom_blob": {"@misc": "spacytextblob.de_blob"}}
2019

2120
nlp = spacy.load("de_core_news_sm")
2221
nlp.add_pipe("spacytextblob", config=config)

docs/static/reference_code/textblob_example.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,4 +13,4 @@
1313
# [(['really', 'horrible'], -1.0, 1.0, None), (['worst', '!'], -1.0, 1.0, None), (['really', 'good'], 0.7, 0.6000000000000001, None), (['happy'], 0.8, 1.0, None)]
1414

1515
print(blob.ngrams())
16-
# [WordList(['I', 'had', 'a']), WordList(['had', 'a', 'really']), WordList(['a', 'really', 'horrible']), WordList(['really', 'horrible', 'day']), WordList(['horrible', 'day', 'It']), WordList(['day', 'It', 'was']), WordList(['It', 'was', 'the']), WordList(['was', 'the', 'worst']), WordList(['the', 'worst', 'day']), WordList(['worst', 'day', 'ever']), WordList(['day', 'ever', 'But']), WordList(['ever', 'But', 'every']), WordList(['But', 'every', 'now']), WordList(['every', 'now', 'and']), WordList(['now', 'and', 'then']), WordList(['and', 'then', 'I']), WordList(['then', 'I', 'have']), WordList(['I', 'have', 'a']), WordList(['have', 'a', 'really']), WordList(['a', 'really', 'good']), WordList(['really', 'good', 'day']), WordList(['good', 'day', 'that']), WordList(['day', 'that', 'makes']), WordList(['that', 'makes', 'me']), WordList(['makes', 'me', 'happy'])]
16+
# [WordList(['I', 'had', 'a']), WordList(['had', 'a', 'really']), WordList(['a', 'really', 'horrible']), WordList(['really', 'horrible', 'day']), WordList(['horrible', 'day', 'It']), WordList(['day', 'It', 'was']), WordList(['It', 'was', 'the']), WordList(['was', 'the', 'worst']), WordList(['the', 'worst', 'day']), WordList(['worst', 'day', 'ever']), WordList(['day', 'ever', 'But']), WordList(['ever', 'But', 'every']), WordList(['But', 'every', 'now']), WordList(['every', 'now', 'and']), WordList(['now', 'and', 'then']), WordList(['and', 'then', 'I']), WordList(['then', 'I', 'have']), WordList(['I', 'have', 'a']), WordList(['have', 'a', 'really']), WordList(['a', 'really', 'good']), WordList(['really', 'good', 'day']), WordList(['good', 'day', 'that']), WordList(['day', 'that', 'makes']), WordList(['that', 'makes', 'me']), WordList(['makes', 'me', 'happy'])]

docs/static/reference_code/textblob_fr_example.py

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,26 +1,26 @@
11
import spacy
2-
from spacytextblob.spacytextblob import SpacyTextBlob
32
from textblob import Blobber
4-
from textblob_fr import PatternTagger, PatternAnalyzer
3+
from textblob_fr import PatternAnalyzer, PatternTagger
4+
5+
from spacytextblob.spacytextblob import SpacyTextBlob # noqa: F401
6+
57

68
@spacy.registry.misc("spacytextblob.fr_blob")
79
def create_fr_blob():
810
tb = Blobber(pos_tagger=PatternTagger(), analyzer=PatternAnalyzer())
911
return tb
1012

11-
config = {
12-
"blob_only": True,
13-
"custom_blob": {"@misc": "spacytextblob.fr_blob"}
14-
}
13+
14+
config = {"custom_blob": {"@misc": "spacytextblob.fr_blob"}}
1515

1616
nlp_fr = spacy.load("fr_core_news_sm")
1717
nlp_fr.add_pipe("spacytextblob", config=config)
18-
text = u"Quelle belle matinée"
18+
text = "Quelle belle matinée"
1919
doc = nlp_fr(text)
2020

2121
print(doc)
2222
# Quelle belle matinée
2323
print(doc._.blob)
2424
# Quelle belle matinée
2525
print(doc._.blob.sentiment)
26-
# Quelle belle matinée
26+
# (0.8, 0.8)

docs/tutorial/textblob_extensions.md

Lines changed: 27 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -9,76 +9,49 @@ TextBlob supports adding custom models and new languages through “extensions
99
```python linenums="1"
1010
import spacy
1111
from spacytextblob.spacytextblob import SpacyTextBlob
12-
from textblob_de import TextBlobDE # (1)
12+
from textblob import Blobber
13+
from textblob_fr import PatternTagger, PatternAnalyzer # (1)
1314

15+
text = u"Quelle belle matinée"
1416

15-
text = '''
16-
Heute ist der 3. Mai 2014 und Dr. Meier feiert seinen 43. Geburtstag. Ich muss
17-
unbedingt daran denken, Mehl, usw. für einen Kuchen einzukaufen. Aber leider
18-
habe ich nur noch EUR 3.50 in meiner Brieftasche.
19-
'''
17+
@spacy.registry.misc("spacytextblob.fr_blob") # (2)
18+
def create_fr_blob():
19+
tb = Blobber(pos_tagger=PatternTagger(), analyzer=PatternAnalyzer())
20+
return tb # (3)
2021

21-
@spacy.registry.misc("spacytextblob.de_blob") # (2)
22-
def create_de_blob():
23-
return TextBlobDE # (3)
22+
nlp_fr = spacy.load("fr_core_news_sm")
2423

25-
26-
nlp = spacy.load("de_core_news_sm")
27-
28-
nlp.add_pipe(
29-
"spacytextblob", # (4)
24+
nlp_fr.add_pipe(
25+
"spacytextblob", # (4)
3026
config={ # (5)
31-
"blob_only": True, # (6)
32-
"custom_blob": {"@misc": "spacytextblob.de_blob"} # (7)
27+
"custom_blob": {"@misc": "spacytextblob.fr_blob"} # (6)
3328
}
3429
)
35-
doc = nlp(text)
3630

37-
print(doc._.blob.sentences)
38-
# [Sentence("Heute ist der 3. Mai 2014 und Dr. Meier feiert seinen 43. Geburtstag."), Sentence("Ich muss unbedingt daran denken, Mehl, usw. für einen Kuchen einzukaufen."), Sentence("Aber leider habe ich nur noch EUR 3.50 in meiner Brieftasche.")]
31+
doc = nlp_fr(text)
3932

33+
print(doc)
34+
# Quelle belle matinée
35+
print(doc._.blob)
36+
# Quelle belle matinée
4037
print(doc._.blob.sentiment)
41-
# Sentiment(polarity=0.0, subjectivity=0.0)
42-
43-
print(doc._.blob.tags)
44-
# [('Heute', 'RB'), ('ist', 'VB'), ('der', 'DT'), ('3.', 'LS'), ('Mai', 'NN'), ('2014', 'CD'), ('und', 'CC'), ('Dr.', 'NN'), ('Meier', 'NN'), ('feiert', 'NN'), ('seinen', 'PRP$'), ('43.', 'CD'), ('Geburtstag', 'NN'), ('Ich', 'PRP'), ('muss', 'VB'), ('unbedingt', 'RB'), ('daran', 'RB'), ('denken', 'VB'), ('Mehl', 'NN'), ('usw.', 'IN'), ('für', 'IN'), ('einen', 'DT'), ('Kuchen', 'JJ'), ('einzukaufen', 'NN'), ('Aber', 'CC'), ('leider', 'VBN'), ('habe', 'VB'), ('ich', 'PRP'), ('nur', 'RB'), ('noch', 'IN'), ('EUR', 'NN'), ('3.50', 'CD'), ('in', 'IN'), ('meiner', 'JJ'), ('Brieftasche', 'NN')]
38+
# (0.8, 0.8)
4539
```
4640

4741
1. Load the TextBlob extension package.
48-
2. For a function to be used inside the NLP pipeline you must register the function with spacy using `@spacy.registry.misc()`. You can name the function what ever you like. For the example I have registered the function with the name `"spacytextblob.de_blob"`.
49-
3. *spacytextblob* is able to support TextBlob extensions by replacing the default `textblob.TextBlob` with an alternative. In the case of the [textblob-de](https://github.com/markuskiller/textblob-de) extension they provide an alternative blob that you can import (`from textblob_de import TextBlobDE`).
42+
2. For a function to be used inside the NLP pipeline, you must register it with spaCy using `@spacy.registry.misc()`. You can name the function whatever you like. In this example, the function is registered under the name `"spacytextblob.fr_blob"`.
43+
3. *spacytextblob* is able to support TextBlob extensions by replacing the default `textblob.TextBlob` with an alternative.
5044
4. Add *spacytextblob* to your spaCy pipeline as you normally would.
5145
5. The `config` parameter allows you to pass additional configuration options to the *spacytextblob* pipeline.
52-
6. When using a TextBlob extension you should always set `"blob_only": True`. The extension may modify the textblob.TextBlob object. By setting `"blob_only": True` *spacytextblob* will only expose `._.blob` and not attempt to expose `._.polarity`, `._.subjectivity`, or `._.assessments`.
53-
7. The `"custom_blob"` key should be assigned to a dictionary that tells spaCy what function to replace `textblob.TextBlob` with. In this case, we want to replace it with `TextBlobDE`. The key of the dictionary is `"@misc"`. This tells spaCy to look into the misc section of the spaCy register. The value should be the string name of the function that we registered above in line 12.
46+
6. The `"custom_blob"` key should be assigned to a dictionary that tells spaCy what to replace `textblob.TextBlob` with. In this case, we want to replace it with the French `Blobber` returned by `create_fr_blob`. The key of the dictionary is `"@misc"`, which tells spaCy to look in the misc section of its registry. The value should be the string name under which the function was registered above, `"spacytextblob.fr_blob"`.
5447

5548
## Extensions
5649

5750
The following extensions have been tested and are supported. Other extensions may work, but have not been tested.
5851

59-
### textblob-de
60-
61-
textblob-de is a TextBlob extensions that enables German language support for TextBlob by Steven Loria ([https://github.com/markuskiller/textblob-de](https://github.com/markuskiller/textblob-de)).
62-
63-
```bash
64-
pip install textblob-de
65-
```
66-
67-
To use it with *spacytextblob* First install a spaCy model that supports German ([https://spacy.io/models/de](https://spacy.io/models/de)):
68-
69-
```bash
70-
python -m spacy download de_core_news_sm
71-
```
72-
73-
The code below demonstrates how you can then use and access textblob-de within *spacytextblob*.
74-
75-
```python linenums="1"
76-
{! docs/static/reference_code/textblob_de_example.py !}
77-
```
78-
7952
### textblob-fr
8053

81-
textblob-fr is a TextBlob extension that enables French language support for TextBlob ([https://github.com/sloria/textblob-fr](https://github.com/sloria/textblob-fr)).
54+
textblob-fr is a TextBlob extension that enables French language support for TextBlob ([https://github.com/sloria/textblob-fr](https://github.com/sloria/textblob-fr)).
8255

8356
```bash
8457
pip install textblob-fr
@@ -96,9 +69,14 @@ The code below demonstrates how you can then use and access textblob-fr within *
9669
{! docs/static/reference_code/textblob_fr_example.py !}
9770
```
9871

72+
### textblob-de
73+
74+
!!! warning
75+
76+
textblob-de is **not** supported as of spacytextblob 4.1.0. The textblob-de library depends on a Google Translate feature that no longer works. More details can be found in this issue: [https://github.com/markuskiller/textblob-de/issues/24](https://github.com/markuskiller/textblob-de/issues/24).
77+
9978
### textblob-aptagger
10079

10180
!!! warning
10281

10382
textblob-aptagger is **not** supported. As of TextBlob 0.11.0, TextBlob uses NLTK's averaged perceptron tagger by default. This package is no longer necessary ([https://github.com/sloria/textblob-aptagger](https://github.com/sloria/textblob-aptagger)).
104-
