|
1 |
| -# Diffbot LLM |
| 1 | +# Diffbot GraphRAG LLM |
2 | 2 |
|
3 |
| -Diffbot LLM is an API that combines the reasoning ability of frontier large language models with the knowledge |
4 |
| -found in Diffbot's Knowledge Graph and realtime web index using Retrieval Augmented Generation. |
| 3 | +## 1. Introduction |
5 | 4 |
|
6 |
| -Diffbot LLM is best suited for applications that require maximum accuracy and |
7 |
| -factual grounding with authoritative citations. |
| 5 | +Recently, large language models (LLMs) have been trained with more and more data, leading to an increase in the number of parameters and the compute power needed. But what if, instead of feeding the model more data, we purposefully trained it to rely less on its pretraining data and more on its ability to find external knowledge? |
8 | 6 |
|
9 |
| -Diffbot LLM API is fully compatible with OpenAI's Chat Completion API and can be used as a drop-in replacement. |
| 7 | +To test this idea, we fine-tuned Llama 3.3 70B to be an expert tool user of a real-time Knowledge Graph API, providing the first open-source implementation of a GraphRAG system that outperforms Google Gemini and ChatGPT. |
10 | 8 |
|
11 |
| -## Diffbot LLM Inference Server |
| 9 | +## 2. Features |
12 | 10 |
|
13 |
| -Diffbot LLM is available as a service hosted by Diffbot, or it can be self-hosted with the following open-source models: |
14 |
| - * diffbot-small (8b llama 3.1 fine tune): https://huggingface.co/diffbot/diffbot-small-1.0 |
15 |
| - * diffbot-small-xl (70b llama 3.1 fine tune): https://huggingface.co/diffbot/diffbot-small-xl-1.0 |
| 11 | +### Real-time web URL extraction |
16 | 12 |
|
17 |
| -This repo contains the Diffbot LLM Inference, a system to serve the open-source Diffbot LLM models |
18 |
| -with built-in tool calling for Diffbot's Knowledge Graph and realtime web index. |
| 13 | + |
19 | 14 |
|
20 |
| -This system requires an Nvidia GPU with at least 80G VRAM. |
| 15 | +As a RAG system, Diffbot LLM can summarize a web document in real-time, appropriately crediting the original source. |
21 | 16 |
|
22 |
| -## Demo |
| 17 | +### Expert Retriever of Factual citations |
23 | 18 |
|
24 |
| -A demo of Diffbot LLM is available at https://diffy.chat |
| 19 | + |
25 | 20 |
|
26 |
| - |
| 21 | +Diffbot LLM is explicitly trained to align the cited text with the reference source. |
27 | 22 |
|
28 |
| -## Usage |
| 23 | +### Knowledge Graph Querying |
29 | 24 |
|
30 |
| -### API |
| 25 | + |
31 | 26 |
|
32 |
| -#### Who is the CEO of Nike? |
| 27 | + Diffbot LLM is an expert tool user of the Diffbot (Knowledge Graph) Query Language. |
33 | 28 |
|
34 |
| -```python |
35 |
| -from openai import OpenAI |
36 |
| -# get your free token at https://app.diffbot.com/get-started/ |
37 |
| -diffbot_token = "<YOUR_TOKEN>" |
38 |
| -base_url = "http://<YOUR_SERVER>:8001/rag/v1" # or https:/llm.diffbot.com/rag/v1 for Diffbot-hosted |
39 |
| -client = OpenAI(api_key=diffbot_token, base_url=base_url) |
40 |
| -completion = client.chat.completions.create( |
41 |
| - model="diffbot-small", |
42 |
| - temperature=0, |
43 |
| - messages=[ |
44 |
| - {"role": "system", "content": "You are a helpful assistant."}, |
45 |
| - { |
46 |
| - "role": "user", |
47 |
| - "content": "Who is Nike's CEO?" |
48 |
| - } |
49 |
| - ] |
50 |
| -) |
51 |
| -print (completion) |
52 |
| -``` |
| 29 | +### Image Entailment |
| 30 | + |
| 31 | + |
53 | 32 |
|
54 |
| -## Evaluation |
| 33 | + Diffbot LLM can also entail images. |
55 | 34 |
|
56 |
| -### MMLU-Pro |
| 35 | +### Code Interpreter Tool Use |
| 36 | + |
| 37 | + |
| 38 | + |
| 39 | + |
| 40 | +Instead of relying on the model weights for performing empirical calculations, Diffbot LLM is an expert tool user of a JavaScript interpreter that it can use to inform its response. |
| 41 | + |
| 42 | + |
| 43 | + |
| 44 | +### Fun stuff |
57 | 45 |
|
| 46 | + |
58 | 47 |
|
59 |
| -### FreshQA |
| 48 | +Diffbot LLM is an expert maker of ASCII-art weather forecasts, grounded in real sources. |
60 | 49 |
|
61 |
| - |
| 50 | +## 3. Model Download |
62 | 51 |
|
63 |
| -FreshQA is a dynamic question answering benchmark encompassing a diverse range of question and answer types, including questions that |
64 |
| -require fast-changing world knowledge as well as questions with false premises that need to be debunked. |
| 52 | +Available on HuggingFace at: |
| 53 | + * diffbot-small (8b Llama 3.1 fine tune): https://huggingface.co/diffbot/Llama-3.1-Diffbot-Small-2412 |
| 54 | + * diffbot-small-xl (70b Llama 3.3 fine tune): https://huggingface.co/diffbot/Llama-3.3-Diffbot-Small-XL-2412 |
| 55 | + |
| 56 | +## 4. Accuracy Benchmarks |
| 57 | + |
| 58 | +### FreshQA Dataset |
| 59 | + |
| 60 | + |
| 61 | + |
| 62 | +[FreshQA](https://arxiv.org/abs/2310.03214) is a benchmark that measures real-time accuracy for search RAG systems. Diffbot LLM outperforms gpt-4o (no web access), ChatGPT (with web access), Google Gemini, and Perplexity on real-time factual accuracy. |
65 | 63 |
|
66 | 64 | In this evaluation, we focus on 130 FreshQA questions whose answers have changed in 2024, which is after the knowledge
|
67 | 65 | cutoff for all evaluated models as of December 2024.
|
68 | 66 |
|
69 |
| -## Pricing |
| 67 | +### MMLU-Pro |
| 68 | + |
| 69 | +[MMLU-Pro](https://arxiv.org/abs/2406.01574) is a more difficult version of the [MMLU](https://arxiv.org/abs/2009.03300) benchmark that tests for static knowledge of 57 academic subjects using 10-choice multiple-choice questions. See the [MMLU-Pro Leaderboard](https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro). |
70 | 70 |
|
71 |
| -Get a free token at: https://app.diffbot.com/get-started/ |
| 71 | +The tables below show the MMLU-Pro scores of diffbot-small and diffbot-small-xl against the base models they were fine-tuned from. |
72 | 72 |
|
73 |
| -Contact [email protected] if need more credits or higher limits. |
| 73 | +| Model | Accuracy (CoT 5-shot) | |
| 74 | +| ----- | ----------------- | |
| 75 | +| diffbot-small-xl | 72.89 | |
| 76 | +| Llama-3.3-70B Instruct | 65.92 | |
74 | 77 |
|
75 |
| -## Self-Hosting |
| 78 | +| Model | Accuracy (CoT 5-shot) | |
| 79 | +| ----- | ----------------- | |
| 80 | +| diffbot-small | 48.64 | |
| 81 | +| Llama-3.1-8B Instruct | 44.25 | |
76 | 82 |
|
77 |
| -### Using Docker image and models in huggingface |
| 83 | +Note: This is a measurement of the Diffbot GraphRAG LLM API end-to-end, not a measure of the knowledge contained in the weights. The lift in its performance over the base model comes from its ability to access external tools. |
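The lift mentioned in the note can be worked out directly from the two tables. The sketch below does that arithmetic in Python; the `SCORES` dictionary and the `lift` helper are names introduced here for illustration and are not part of the repository.

```python
# MMLU-Pro accuracy (CoT 5-shot) copied from the tables above:
# (fine-tuned model score, base model score)
SCORES = {
    "diffbot-small-xl": (72.89, 65.92),  # base: Llama-3.3-70B Instruct
    "diffbot-small": (48.64, 44.25),     # base: Llama-3.1-8B Instruct
}


def lift(tuned, base):
    """Return (absolute lift in points, relative lift in percent)."""
    absolute = tuned - base
    relative = absolute / base * 100
    return round(absolute, 2), round(relative, 1)


if __name__ == "__main__":
    for name, (tuned, base) in SCORES.items():
        points, percent = lift(tuned, base)
        print(f"{name}: +{points} points ({percent}% over base)")
```

By this calculation the end-to-end system gains roughly 7 points on the 70B model and roughly 4.4 points on the 8B model, which is about a 10% relative improvement in both cases.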
| 84 | + |
| 85 | + |
| 86 | +## 5. Demo |
| 87 | + |
| 88 | +Try Diffbot LLM using the demo app at https://diffy.chat |
| 89 | + |
| 90 | +## 6. Running Locally |
| 91 | + |
| 92 | +Tested minimum hardware configurations: |
| 93 | + |
| 94 | + - Nvidia A100 40G for diffbot-small |
| 95 | + - 2x Nvidia H100 80G for diffbot-small-xl @ FP8 |
| 96 | + |
| 97 | +Using Docker image and models in huggingface |
78 | 98 | 1. Pull docker image: `docker pull docker.io/diffbot/diffbot-llm-inference:latest`
|
79 | 99 | 2. Run docker image. **Note: The model weights will be automatically downloaded from huggingface.
|
80 | 100 | This might take a few minutes.**
|
81 | 101 |
|
82 | 102 | ```bash
|
83 |
| -docker run --runtime nvidia --gpus all -p 8001:8001 --ipc=host -e VLLM_OPTIONS="--model diffbot/diffbot-small-1.0 --served-model-name diffbot-small --enable-prefix-caching" docker.io/diffbot/diffbot-llm-inference:latest |
| 103 | +docker run --runtime nvidia --gpus all -p 8001:8001 --ipc=host -e VLLM_OPTIONS="--model diffbot/Llama-3.1-Diffbot-Small-2412 --served-model-name diffbot-small --enable-prefix-caching" docker.io/diffbot/diffbot-llm-inference:latest |
84 | 104 | ```
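Once the container is up, the server can be queried on the published port. The sketch below is a minimal illustration using only the Python standard library. It assumes the self-hosted server exposes the same OpenAI-compatible `/rag/v1` path as the hosted service (the previous README used `http://<YOUR_SERVER>:8001/rag/v1`) and that the chat endpoint sits at `/chat/completions` under it, which is where an OpenAI-compatible client would post. The helper names `build_chat_request` and `send` are hypothetical.

```python
import json
import urllib.request


def build_chat_request(host, token, question, model="diffbot-small", port=8001):
    """Build (but do not send) a chat completion request for a self-hosted server.

    Assumption: the server follows the OpenAI Chat Completions layout under /rag/v1.
    """
    url = f"http://{host}:{port}/rag/v1/chat/completions"
    payload = {
        "model": model,  # must match --served-model-name in VLLM_OPTIONS
        "temperature": 0,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the request; call send(req) against a running server.
    req = build_chat_request("localhost", "<diffbot_token>", "Who is Nike's CEO?")
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Separating request construction from sending keeps the example inspectable without a GPU server running; the `openai` client shown elsewhere in this README remains the simpler option for real use.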
|
| 105 | +## 7. Using the Serverless API |
85 | 106 |
|
86 |
| -## Extending Diffbot LLM Inference Server |
| 107 | +Get a free Diffbot developer token at https://app.diffbot.com/get-started |
87 | 108 |
|
88 |
| -To extend the Diffbot LLM Inference Server with new tools, please refer to [this tutorial](add_tool_to_diffbot_llm_inference.md). |
| 109 | +```python |
| 110 | +from openai import OpenAI |
89 | 111 |
|
90 |
| -## Diffbot-Hosted Service |
| 112 | +client = OpenAI( |
| 113 | + base_url = "https://llm.diffbot.com/rag/v1", |
| 114 | + api_key = "<diffbot_token>" |
| 115 | +) |
91 | 116 |
|
92 |
| -To test the Diffbot LLM Inference Server before self-hosting, set the base_url to `https:/llm.diffbot.com/rag/v1` |
| 117 | +completion = client.chat.completions.create( |
| 118 | + model="diffbot-small-xl", |
| 119 | + temperature=0, |
| 120 | + messages=[ |
| 121 | + { |
| 122 | + "role": "user", |
| 123 | + "content": "What is the Diffbot Knowledge Graph?" |
| 124 | + } |
| 125 | + ] |
| 126 | +) |
| 127 | +print(completion) |
| 128 | +``` |
| 129 | +Contact [email protected] if you need more credits or higher limits. |
93 | 130 |
|
94 |
| -```python |
95 |
| -from openai import OpenAI |
96 |
| -# get your free token at https://app.diffbot.com/get-started/ |
97 |
| -diffbot_token = "<YOUR_TOKEN>" |
98 |
| -base_url = "https:/llm.diffbot.com/rag/v1" |
99 |
| -client = OpenAI(api_key=diffbot_token, base_url=base_url) |
100 |
| -``` |
| 131 | +## 8. Adding Custom Tools |
| 132 | + |
| 133 | +To extend the Diffbot LLM Inference Server with new tools, please refer to [this tutorial](add_tool_to_diffbot_llm_inference.md). |