
Commit 7855a74

Update README.md
1 parent ec485a6 commit 7855a74

File tree

8 files changed (+94 / -61 lines)

README.md

Lines changed: 94 additions & 61 deletions

# Diffbot GraphRAG LLM

## 1. Introduction

Recently, large language models (LLMs) have been trained on more and more data, leading to ever larger parameter counts and compute requirements. But what if, instead of feeding the model more data, we purposefully trained it to rely less on its pretraining data and more on its ability to find external knowledge?

To test this idea, we fine-tuned Llama 3.3 70B to be an expert tool user of a real-time Knowledge Graph API, providing the first open-source implementation of a GraphRAG system that outperforms Google Gemini and ChatGPT.

## 2. Features

### Real-time web URL extraction

![extract example](./static/extract.webp)

As a RAG system, Diffbot LLM can summarize a web document in real time, appropriately crediting the original source.
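For instance, a request along these lines is a minimal sketch against the OpenAI-compatible endpoint described in sections 6 and 7; the prompt wording and the example URL are only illustrative:

```python
from openai import OpenAI

# Point the standard OpenAI client at the Diffbot RAG endpoint
# (serverless URL shown in section 7; a self-hosted server works the same way).
client = OpenAI(base_url="https://llm.diffbot.com/rag/v1", api_key="<diffbot_token>")

completion = client.chat.completions.create(
    model="diffbot-small",
    temperature=0,
    messages=[{
        "role": "user",
        # The URL is included directly in the prompt; the model extracts the
        # page in real time and cites it in the answer.
        "content": "Summarize https://blog.diffbot.com in one paragraph.",
    }],
)
print(completion.choices[0].message.content)
```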

### Expert Retriever of Factual Citations

![Mission statement of the FAA](./static/faa.webp)

Diffbot LLM is explicitly trained to align the cited text with the reference source.

### Knowledge Graph Querying

![which state contains J?](./static/newjersey.webp)

Diffbot LLM is an expert tool user of the Diffbot (Knowledge Graph) Query Language.
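As a rough sketch of what this looks like through the API (the question mirrors the screenshot above; the DQL in the comment is only an indicative example of the kind of query the model composes on your behalf, not a request you issue yourself):

```python
from openai import OpenAI

client = OpenAI(base_url="https://llm.diffbot.com/rag/v1", api_key="<diffbot_token>")

# A question better answered from structured data than from web search.
# Internally the model writes a Diffbot Query Language (DQL) query against
# the Knowledge Graph (e.g. something like type:AdministrativeArea name:"New Jersey",
# illustrative only) and grounds its answer in the returned entities.
completion = client.chat.completions.create(
    model="diffbot-small",
    temperature=0,
    messages=[{"role": "user", "content": "Which US state's name contains the letter J?"}],
)
print(completion.choices[0].message.content)
```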

### Image Entailment

![How to draw baby shark](./static/babyshark.webp)

Diffbot LLM can also entail images.

### Code Interpreter Tool Use

![strawberry problem](./static/strawberry.webp)

Instead of relying on the model weights for empirical calculations, Diffbot LLM is an expert tool user of a JavaScript interpreter that it can use to inform its response.

![is 9.11 or 9.9 larger](./static/math.webp)
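The same chat interface exercises this behavior; the comparison below restates the screenshot above and is the kind of calculation the model delegates to the interpreter rather than answering from its weights (a minimal sketch):

```python
from openai import OpenAI

client = OpenAI(base_url="https://llm.diffbot.com/rag/v1", api_key="<diffbot_token>")

# The model evaluates the numeric comparison with its JavaScript interpreter
# tool instead of guessing from token statistics.
completion = client.chat.completions.create(
    model="diffbot-small",
    temperature=0,
    messages=[{"role": "user", "content": "Is 9.11 or 9.9 larger?"}],
)
print(completion.choices[0].message.content)
```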
### Fun stuff

![weather in Menlo park](./static/weather.webp)

Diffbot LLM is an expert maker of ASCII-art weather forecasts, grounded in real sources.

## 3. Model Download

Available on HuggingFace at:

* diffbot-small (8B Llama 3.1 fine-tune): https://huggingface.co/diffbot/Llama-3.1-Diffbot-Small-2412
* diffbot-small-xl (70B Llama 3.3 fine-tune): https://huggingface.co/diffbot/Llama-3.3-Diffbot-Small-XL-2412
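To fetch the weights programmatically (for vLLM or another serving stack), here is a minimal sketch using the `huggingface_hub` library; this assumes the package is installed, and note that the Docker image in section 6 downloads the weights for you automatically:

```python
from huggingface_hub import snapshot_download

# Download the 8B model into the local HuggingFace cache; swap in
# "diffbot/Llama-3.3-Diffbot-Small-XL-2412" for the 70B model.
local_path = snapshot_download(repo_id="diffbot/Llama-3.1-Diffbot-Small-2412")
print(local_path)
```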
## 4. Accuracy Benchmarks

### FreshQA Dataset

![Accuracy for FreshQA 2024 queries](./static/freshqa.png)

[FreshQA](https://arxiv.org/abs/2310.03214) is a benchmark that measures real-time accuracy for search RAG systems. Diffbot LLM outperforms gpt-4o (no web access), ChatGPT (with web access), Google Gemini, and Perplexity on real-time factual accuracy.

In this evaluation, we focus on 130 FreshQA questions whose answers have changed in 2024, which is after the knowledge cutoff for all evaluated models as of December 2024.

### MMLU-Pro

[MMLU-Pro](https://arxiv.org/abs/2406.01574) is a more difficult version of the [MMLU](https://arxiv.org/abs/2009.03300) benchmark that tests static knowledge of 57 academic subjects using 10-choice multiple-choice questions ([MMLU-Pro Leaderboard](https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro)).

The tables below show the MMLU-Pro scores of diffbot-small and diffbot-small-xl against the base models they were fine-tuned from.

| Model | Accuracy (CoT 5-shot) |
| ----- | --------------------- |
| diffbot-small-xl | 72.89 |
| Llama-3.3-70B Instruct | 65.92 |

| Model | Accuracy (CoT 5-shot) |
| ----- | --------------------- |
| diffbot-small | 48.64 |
| Llama-3.1-8B Instruct | 44.25 |

Note: This is a measurement of the Diffbot GraphRAG LLM API end-to-end, not a measure of the knowledge contained in the weights. The lift in performance over the base models comes from the system's ability to access external tools.

## 5. Demo

Try Diffbot LLM using the demo app at https://diffy.chat

## 6. Running Locally

Tested minimum hardware configurations:

- Nvidia A100 40G for diffbot-small
- Nvidia 2x H100 80G for diffbot-small-xl @ FP8

Using the Docker image and models from HuggingFace:

1. Pull the docker image: `docker pull docker.io/diffbot/diffbot-llm-inference:latest`
2. Run the docker image. **Note: the model weights will be automatically downloaded from HuggingFace. This might take a few minutes.**

```bash
docker run --runtime nvidia --gpus all -p 8001:8001 --ipc=host -e VLLM_OPTIONS="--model diffbot/Llama-3.1-Diffbot-Small-2412 --served-model-name diffbot-small --enable-prefix-caching" docker.io/diffbot/diffbot-llm-inference:latest
```
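Once the container is running, it exposes the same OpenAI-compatible endpoint on port 8001, so the client shown in section 7 can simply point at your own machine. A minimal sketch (the host name is a placeholder; a free Diffbot token is still used as the API key):

```python
from openai import OpenAI

# Same OpenAI-compatible API as the serverless endpoint, served locally.
client = OpenAI(
    base_url="http://<YOUR_SERVER>:8001/rag/v1",
    api_key="<diffbot_token>",
)

completion = client.chat.completions.create(
    model="diffbot-small",  # matches --served-model-name in the docker command above
    temperature=0,
    messages=[{"role": "user", "content": "Who is Nike's CEO?"}],
)
print(completion.choices[0].message.content)
```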

## 7. Using the Serverless API

Get a free Diffbot developer token at https://app.diffbot.com/get-started

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://llm.diffbot.com/rag/v1",
    api_key="<diffbot_token>"
)

completion = client.chat.completions.create(
    model="diffbot-small-xl",
    temperature=0,
    messages=[
        {
            "role": "user",
            "content": "What is the Diffbot Knowledge Graph?"
        }
    ]
)
print(completion)
```

Contact [email protected] if you need more credits or higher limits.

## 8. Adding Custom Tools

To extend the Diffbot LLM Inference Server with new tools, please refer to [this tutorial](add_tool_to_diffbot_llm_inference.md).

Binary files changed (not shown):

- static/babyshark.webp (3.09 MB)
- static/extract.webp (1.1 MB)
- static/faa.webp (153 KB)
- static/math.webp (383 KB)
- static/newjersey.webp (497 KB)
- static/strawberry.webp (464 KB)
- static/weather.webp (171 KB)
