
Commit ec485a6

Initial commit

1 parent b228210 commit ec485a6

35 files changed: +5778 −2 lines

.gitignore

Lines changed: 11 additions & 0 deletions

@@ -0,0 +1,11 @@
# python
**/__pycache__/
/.vscode/
/.venv/
/.pytest_cache/
/.env/
/tmp/
.idea/
*.iml
.cache/
data/

Dockerfile

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
FROM vllm/vllm-openai:latest

# Install required packages
RUN apt-get update && apt-get install -y \
    supervisor \
    && apt-get clean

# Copy Supervisor configuration
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf

# Copy diffbot-llm code
COPY . /code

WORKDIR /code

# Install requirements
RUN pip install poetry
RUN pip install pyasynchat # required by supervisord
RUN poetry env use python3.10
RUN poetry export -f requirements.txt --output requirements.txt --without-hashes
RUN poetry run pip install --no-cache-dir --upgrade -r /code/requirements.txt

# Expose ports
EXPOSE 3333 8000

# Start Supervisor
ENTRYPOINT ["/usr/bin/supervisord"]

README.md

Lines changed: 100 additions & 2 deletions

@@ -1,2 +1,100 @@
-# diffbot-llm-inference
-DIffbot LLM Inference Server

# Diffbot LLM

Diffbot LLM is an API that combines the reasoning ability of frontier large language models with the knowledge found in Diffbot's Knowledge Graph and realtime web index, using Retrieval-Augmented Generation.

Diffbot LLM is best suited for applications that require maximum accuracy and factual grounding with authoritative citations.

The Diffbot LLM API is fully compatible with OpenAI's Chat Completions API and can be used as a drop-in replacement.

## Diffbot LLM Inference Server

Diffbot LLM is available as a service hosted by Diffbot, or it can be self-hosted with the following open-source models:
* diffbot-small (8B Llama 3.1 fine-tune): https://huggingface.co/diffbot/diffbot-small-1.0
* diffbot-small-xl (70B Llama 3.1 fine-tune): https://huggingface.co/diffbot/diffbot-small-xl-1.0

This repo contains Diffbot LLM Inference, a system that serves the open-source Diffbot LLM models with built-in tool calling for Diffbot's Knowledge Graph and realtime web index.

This system requires an Nvidia GPU with at least 80GB of VRAM.

## Demo

A demo of Diffbot LLM is available at https://diffy.chat

![DiffyChat](static/demo.png)

## Usage

### API

#### Who is the CEO of Nike?

```python
from openai import OpenAI

# get your free token at https://app.diffbot.com/get-started/
diffbot_token = "<YOUR_TOKEN>"
base_url = "http://<YOUR_SERVER>:8001/rag/v1"  # or https://llm.diffbot.com/rag/v1 for Diffbot-hosted
client = OpenAI(api_key=diffbot_token, base_url=base_url)

completion = client.chat.completions.create(
    model="diffbot-small",
    temperature=0,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who is Nike's CEO?"},
    ],
)
print(completion)
```
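Because the endpoint is OpenAI-compatible, the same request can be expressed as a raw HTTP call without the `openai` client. The sketch below builds the request with the standard library only; the URL and token are placeholders, and the final send is left commented out:

```python
# Sketch: the "Who is Nike's CEO?" request as a raw OpenAI-style HTTP call.
# Assumes the standard /chat/completions route of OpenAI-compatible servers.
import json
import urllib.request

base_url = "https://llm.diffbot.com/rag/v1"  # or your self-hosted server
diffbot_token = "<YOUR_TOKEN>"

payload = {
    "model": "diffbot-small",
    "temperature": 0,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who is Nike's CEO?"},
    ],
}

req = urllib.request.Request(
    base_url + "/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {diffbot_token}",
    },
)
# with urllib.request.urlopen(req) as resp:  # uncomment with a real token
#     print(json.load(resp))
```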
## Evaluation

### MMLU-Pro

### FreshQA

![Accuracy for FreshQA 2024 queries](static/freshqa.png)

FreshQA is a dynamic question-answering benchmark encompassing a diverse range of question and answer types, including questions that require fast-changing world knowledge as well as questions with false premises that need to be debunked.

In this evaluation, we focus on 130 FreshQA questions whose answers changed in 2024, which is after the knowledge cutoff of all evaluated models as of December 2024.

## Pricing

Get a free token at: https://app.diffbot.com/get-started/

Contact [email protected] if you need more credits or higher limits.

## Self-Hosting

### Using the Docker image and models on Hugging Face

1. Pull the Docker image: `docker pull docker.io/diffbot/diffbot-llm-inference:latest`
2. Run the Docker image. **Note: the model weights will be downloaded automatically from Hugging Face. This may take a few minutes.**

```bash
docker run --runtime nvidia --gpus all -p 8001:8001 --ipc=host -e VLLM_OPTIONS="--model diffbot/diffbot-small-1.0 --served-model-name diffbot-small --enable-prefix-caching" docker.io/diffbot/diffbot-llm-inference:latest
```
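The command above serves the 8B model. Presumably the 70B diffbot-small-xl checkpoint can be served the same way by swapping the model flags inside `VLLM_OPTIONS`; the `--served-model-name` value below is an assumption, mirroring the small-model example:

```bash
# Assumed variant for the xl model (flag values mirror the diffbot-small example)
docker run --runtime nvidia --gpus all -p 8001:8001 --ipc=host \
  -e VLLM_OPTIONS="--model diffbot/diffbot-small-xl-1.0 --served-model-name diffbot-small-xl --enable-prefix-caching" \
  docker.io/diffbot/diffbot-llm-inference:latest
```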
## Extending Diffbot LLM Inference Server

To extend the Diffbot LLM Inference Server with new tools, please refer to [this tutorial](add_tool_to_diffbot_llm_inference.md).

## Diffbot-Hosted Service

To test the Diffbot LLM Inference Server before self-hosting, set the base_url to `https://llm.diffbot.com/rag/v1`:

```python
from openai import OpenAI

# get your free token at https://app.diffbot.com/get-started/
diffbot_token = "<YOUR_TOKEN>"
base_url = "https://llm.diffbot.com/rag/v1"
client = OpenAI(api_key=diffbot_token, base_url=base_url)
```

add_tool_to_diffbot_llm_inference.md

Lines changed: 111 additions & 0 deletions

@@ -0,0 +1,111 @@
# How to Add a Tool to Diffbot LLM Inference

We show how to add a tool to Diffbot LLM Inference by walking through the addition of `execute_js_v1`, a tool for JavaScript code execution.

### 0. Set up local development

To set up the virtual environment:

```
poetry env use python3.10
poetry shell
poetry install
```

To start vLLM, self-host one of the Diffbot LLM models with Docker (see [Self-Hosting](README.md)) and add `-p 8000:8000` to expose the vLLM endpoint. Then set the vLLM endpoint in `config.py`.

To start the server: `./start_server.sh`
### 1. Add the new tool to the system prompt (`system_prompt.txt`)

Below is the original system prompt, which defines the available tools in a TypeScript-style notation:

```
You are a helpful assistant with access to the following functions. Use them if required -
namespace Diffbot {
// Extract the content from the given URLs. Only call this endpoint if the user mentioned a URL.
type extract_v1 = (_: {
// URLs to extract, up to 5
page_url: string[],
}) => any;
// Query the Diffbot Knowledge Graph for an entity or set of entities that match a set of criteria using the Diffbot Query Language syntax.
type dql_v1 = (_: {
// Diffbot Query Language query
dql_query: string,
}) => any;
// Search the web for information that could help answer the user's question.
type web_search_v1 = (_: {
// List of Google advanced search strings (can include phrases, booleans, site:, before:, after:, filetype:, etc)
text: string[],
// Number of results to return (default 5)
num?: number,
// Page number of results to return (default 1)
page?: number,
}) => any;
} // namespace Diffbot
```

To add `execute_js_v1`, we can append the following definition as the last tool:

```
// Execute JavaScript expressions and get accurate results that could help answer the user's question.
type execute_js_v1 = (_: {
// JavaScript expressions to execute separated by newlines
expressions: string,
}) => any;
```

The final result is:

```
You are a helpful assistant with access to the following functions. Use them if required -
namespace Diffbot {
// Extract the content from the given URLs. Only call this endpoint if the user mentioned a URL.
type extract_v1 = (_: {
// URLs to extract, up to 5
page_url: string[],
}) => any;
// Query the Diffbot Knowledge Graph for an entity or set of entities that match a set of criteria using the Diffbot Query Language syntax.
type dql_v1 = (_: {
// Diffbot Query Language query
dql_query: string,
}) => any;
// Search the web for information that could help answer the user's question.
type web_search_v1 = (_: {
// List of Google advanced search strings (can include phrases, booleans, site:, before:, after:, filetype:, etc)
text: string[],
// Number of results to return (default 5)
num?: number,
// Page number of results to return (default 1)
page?: number,
}) => any;
// Execute JavaScript expressions and get accurate results that could help answer the user's question.
type execute_js_v1 = (_: {
// JavaScript expressions to execute separated by newlines
expressions: string,
}) => any;
} // namespace Diffbot
```
### 2. Implement the new tool

See `services/execute_js.py` for the implementation of `execute_js_v1`.

### 3. Call the tool in `llm/plugin.py`

The `invoke` method is responsible for calling the tools requested by the LLM. To call the new tool, we can add the following lines to this method:

```python
if function_name == "execute_js_v1":
    resp = await get_js_execution_service().execute_js(function_arguments["expressions"])
    return PluginResponse(
        plugin_url=function_name, method="INTERNAL", content=resp.json()
    )
```

where `get_js_execution_service().execute_js()` calls the implementation of the new tool.
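The dispatch pattern above can be sketched as a self-contained example. Note that `PluginResponse` and the service below are simplified stand-ins for the real classes in this repo, and the stub merely echoes its input rather than executing JavaScript:

```python
# Minimal sketch of an invoke-style tool dispatch with a stubbed JS service.
import asyncio
from dataclasses import dataclass


@dataclass
class PluginResponse:
    plugin_url: str
    method: str
    content: str


class StubJsExecutionService:
    async def execute_js(self, expressions: str) -> str:
        # A real implementation would sandbox and evaluate the JS expressions;
        # this stub just echoes them back as a JSON string.
        return f'{{"result": "evaluated: {expressions}"}}'


def get_js_execution_service() -> StubJsExecutionService:
    return StubJsExecutionService()


async def invoke(function_name: str, function_arguments: dict) -> PluginResponse:
    # Route the LLM's tool call to the matching service implementation.
    if function_name == "execute_js_v1":
        resp = await get_js_execution_service().execute_js(
            function_arguments["expressions"]
        )
        return PluginResponse(
            plugin_url=function_name, method="INTERNAL", content=resp
        )
    raise ValueError(f"unknown tool: {function_name}")


response = asyncio.run(invoke("execute_js_v1", {"expressions": "1 + 1"}))
```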
