Commit ffc385b

Merge pull request #4 from BeastByteAI/ner-docs
chain of thought docs
2 parents 0c87554 + f8a535d commit ffc385b

3 files changed: +50 -3 lines changed

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+---
+title: Chain-of-thought text classification
+nextjs:
+  metadata:
+    title: Chain-of-thought text classification
+    description: Learn about chain-of-thought text classification.
+---
+
+## Overview
+
+Chain-of-thought text classification is similar to zero-shot classification since it does not require any labeled data beforehand. The only difference is that, in addition to the label itself, the model generates some additional reasoning behind its choice. In some cases, such an approach might lead to much better performance, but at the cost of higher token consumption.
+
+Example using GPT-4o:
+
+```python
+from skllm.models.gpt.classification.zero_shot import CoTGPTClassifier
+from skllm.datasets import get_classification_dataset
+
+# demo sentiment analysis dataset
+# labels: positive, negative, neutral
+X, y = get_classification_dataset()
+
+clf = CoTGPTClassifier(model="gpt-4o")
+clf.fit(X,y)
+predictions = clf.predict(X)
+labels, reasoning = predictions[:, 0], predictions[:, 1]
+```
+
+---
+
+## API Reference
+
+The following API reference only lists the parameters needed for the initialization of the estimator. The remaining methods follow the syntax of a scikit-learn classifier.
+
+### CoTGPTClassifier
+```python
+from skllm.models.gpt.classification.zero_shot import CoTGPTClassifier
+```
+
+| **Parameter** | **Type** | **Description** |
+| ------------- | -------- | ------------------------ |
+| `model` | `str` | Model to use, by default "gpt-3.5-turbo". |
+| `default_label` | `str` | Default label for failed prediction; if "Random" -> selects randomly based on class frequencies, by default "Random". |
+| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |
+| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |
+| `org` | `Optional[str]` | Estimator-specific ORG key; if None, retrieved from the global config, by default None. |
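
The `default_label` row above says that `"Random"` selects a fallback label "randomly based on class frequencies". A rough way to picture that behavior is the sketch below. It uses only the standard library, and the helper `frequency_weighted_fallback` is hypothetical: it is an illustration of the idea, not Scikit-LLM's actual implementation.

```python
import random
from collections import Counter


def frequency_weighted_fallback(y_train, rng=None):
    """Draw one label with probability proportional to its training frequency.

    Hypothetical helper for illustration only; not part of Scikit-LLM.
    """
    rng = rng or random.Random()
    counts = Counter(y_train)
    labels = list(counts)
    weights = [counts[label] for label in labels]
    # random.choices performs weighted sampling with replacement
    return rng.choices(labels, weights=weights, k=1)[0]


# Made-up training labels: "positive" occurs 3 times out of 5
y_train = ["positive", "positive", "positive", "negative", "neutral"]
fallback = frequency_weighted_fallback(y_train, rng=random.Random(0))
print(fallback)  # one of the training labels; "positive" is the most likely draw
```

Under this reading, a failed prediction is more likely to be replaced by a frequent class than by a rare one, which is worth keeping in mind when the training labels are imbalanced.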

src/app/docs/few-shot-text-classification/page.md

Lines changed: 3 additions & 3 deletions
@@ -10,7 +10,7 @@ nextjs:
 
 Few-shot text classification is a task of classifying a text into one of the pre-defined classes based on a few examples of each class. For example, given a few examples of the class _positive_, _negative_, and _neutral_, the model should be able to classify a new text into one of these classes.
 
-The estimators provided by Scikit-LLM do not automatically select the subset of the training data, and instead use the entire training set to construct the examples. Therefore, if your training set is large, you might want to consider splitting it into training and validation sets, while keeping the training set small (we recommend not to exceed 10 examples per class).
+The estimators provided by Scikit-LLM do not automatically select the subset of the training data, and instead use the entire training set to construct the examples. Therefore, if your training set is large, you might want to consider splitting it into training and validation sets, while keeping the training set small (we recommend not to exceed 10 examples per class). Additionally, it is advisable to permute the order of the samples in order to [avoid the recency bias](https://github.com/iryna-kondr/scikit-llm/issues/104).
 
 Example using GPT-4:
 
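
The advice in the changed paragraph above (keep roughly 10 examples per class at most, and permute their order) can be applied to the data before it reaches `fit`. The following is a minimal sketch using only the standard library. `subsample_and_shuffle` is a hypothetical helper, not a Scikit-LLM function, the toy data is made up, and the sketch assumes single-label targets.

```python
import random
from collections import defaultdict


def subsample_and_shuffle(X, y, max_per_class=10, seed=42):
    """Keep at most `max_per_class` examples per label, then shuffle the result.

    Hypothetical helper for illustration only; assumes one hashable label per text.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(X, y):
        by_label[label].append(text)

    pairs = []
    for label, texts in by_label.items():
        # Never ask for more examples than the class actually has
        k = min(max_per_class, len(texts))
        pairs.extend((text, label) for text in rng.sample(texts, k))

    # Shuffle so that examples of one class are not grouped together
    rng.shuffle(pairs)
    X_small = [text for text, _ in pairs]
    y_small = [label for _, label in pairs]
    return X_small, y_small


# Toy data: 15 positive, 12 negative, 3 neutral texts
X = [f"review {i}" for i in range(30)]
y = ["positive"] * 15 + ["negative"] * 12 + ["neutral"] * 3
X_small, y_small = subsample_and_shuffle(X, y, max_per_class=10)
print(len(X_small))  # 10 + 10 + 3 = 23
```

The reduced lists can then be passed to the estimator in place of the full training set, for example `clf.fit(X_small, y_small)`, with the remaining samples held out for validation.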

@@ -26,13 +26,13 @@ from skllm.datasets import (
 
 # single label
 X, y = get_classification_dataset()
-clf = FewShotGPTClassifier(model="gpt-4")
+clf = FewShotGPTClassifier(model="gpt-4o")
 clf.fit(X,y)
 labels = clf.predict(X)
 
 # multi-label
 X, y = get_multilabel_classification_dataset()
-clf = MultiLabelFewShotGPTClassifier(max_labels=2, model="gpt-4")
+clf = MultiLabelFewShotGPTClassifier(max_labels=2, model="gpt-4o")
 clf.fit(X,y)
 labels = clf.predict(X)
 ```

src/lib/navigation.js

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ export const navigation = [
 { title: 'Zero-shot text classification', href: '/docs/zero-shot-text-classification' },
 { title: 'Few-shot text classification', href: '/docs/few-shot-text-classification' },
 { title: 'Dynamic few-shot text classification', href: '/docs/dynamic-few-shot-text-classification' },
+{ title: 'Chain-of-thought text classification', href: '/docs/chain-of-thought-text-classification' },
 { title: 'Tunable text classification', href: '/docs/tunable-text-classification' },
 ],
 },
