
Commit ec485a6

Initial commit

1 parent b228210 commit ec485a6

35 files changed: +5778 −2 lines

.gitignore

Lines changed: 11 additions & 0 deletions

@@ -0,0 +1,11 @@
# python
**/__pycache__/
/.vscode/
/.venv/
/.pytest_cache/
/.env/
/tmp/
.idea/
*.iml
.cache/
data/

Dockerfile

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
FROM vllm/vllm-openai:latest

# Install required packages
RUN apt-get update && apt-get install -y \
    supervisor \
    && apt-get clean

# Copy Supervisor configuration
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf

# Copy diffbot-llm code
COPY . /code

WORKDIR /code

# Install requirements
RUN pip install poetry
RUN pip install pyasynchat # required by supervisord
RUN poetry env use python3.10
RUN poetry export -f requirements.txt --output requirements.txt --without-hashes
RUN poetry run pip install --no-cache-dir --upgrade -r /code/requirements.txt

# Expose ports
EXPOSE 3333 8000

# Start Supervisor
ENTRYPOINT ["/usr/bin/supervisord"]

README.md

Lines changed: 100 additions & 2 deletions

@@ -1,2 +1,100 @@
-# diffbot-llm-inference
-DIffbot LLM Inference Server

# Diffbot LLM

Diffbot LLM is an API that combines the reasoning ability of frontier large language models with the knowledge found in Diffbot's Knowledge Graph and realtime web index, using Retrieval-Augmented Generation.

Diffbot LLM is best suited for applications that require maximum accuracy and factual grounding with authoritative citations.

The Diffbot LLM API is fully compatible with OpenAI's Chat Completions API and can be used as a drop-in replacement.

## Diffbot LLM Inference Server

Diffbot LLM is available as a service hosted by Diffbot, or it can be self-hosted with the following open-source models:
* diffbot-small (8B Llama 3.1 fine-tune): https://huggingface.co/diffbot/diffbot-small-1.0
* diffbot-small-xl (70B Llama 3.1 fine-tune): https://huggingface.co/diffbot/diffbot-small-xl-1.0

This repo contains Diffbot LLM Inference, a system that serves the open-source Diffbot LLM models with built-in tool calling for Diffbot's Knowledge Graph and realtime web index.

This system requires an Nvidia GPU with at least 80GB of VRAM.

## Demo

A demo of Diffbot LLM is available at https://diffy.chat

![DiffyChat](static/demo.png)

## Usage

### API

#### Who is the CEO of Nike?

```python
from openai import OpenAI

# get your free token at https://app.diffbot.com/get-started/
diffbot_token = "<YOUR_TOKEN>"
base_url = "http://<YOUR_SERVER>:8001/rag/v1"  # or https://llm.diffbot.com/rag/v1 for Diffbot-hosted
client = OpenAI(api_key=diffbot_token, base_url=base_url)

completion = client.chat.completions.create(
    model="diffbot-small",
    temperature=0,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who is Nike's CEO?"},
    ],
)
print(completion)
```
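Because the endpoint is OpenAI-compatible, the same request can be expressed as a raw HTTP call without the `openai` client. The sketch below builds the request with the standard library only; the URL and token are placeholders, and the final send is left commented out:

```python
# Sketch: the "Who is Nike's CEO?" request as a raw OpenAI-style HTTP call.
# Assumes the standard /chat/completions route of OpenAI-compatible servers.
import json
import urllib.request

base_url = "https://llm.diffbot.com/rag/v1"  # or your self-hosted server
diffbot_token = "<YOUR_TOKEN>"

payload = {
    "model": "diffbot-small",
    "temperature": 0,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who is Nike's CEO?"},
    ],
}

req = urllib.request.Request(
    base_url + "/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {diffbot_token}",
    },
)
# with urllib.request.urlopen(req) as resp:  # uncomment with a real token
#     print(json.load(resp))
```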
## Evaluation

### MMLU-Pro

### FreshQA

![Accuracy for FreshQA 2024 queries](static/freshqa.png)

FreshQA is a dynamic question-answering benchmark encompassing a diverse range of question and answer types, including questions that require fast-changing world knowledge as well as questions with false premises that need to be debunked.

In this evaluation, we focus on 130 FreshQA questions whose answers changed in 2024, which is after the knowledge cutoff of all evaluated models as of December 2024.

## Pricing

Get a free token at: https://app.diffbot.com/get-started/

Contact [email protected] if you need more credits or higher limits.

## Self-Hosting

### Using the Docker image and models on Hugging Face

1. Pull the Docker image: `docker pull docker.io/diffbot/diffbot-llm-inference:latest`
2. Run the Docker image. **Note: the model weights will be downloaded automatically from Hugging Face. This may take a few minutes.**

```bash
docker run --runtime nvidia --gpus all -p 8001:8001 --ipc=host -e VLLM_OPTIONS="--model diffbot/diffbot-small-1.0 --served-model-name diffbot-small --enable-prefix-caching" docker.io/diffbot/diffbot-llm-inference:latest
```
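The command above serves the 8B model. Presumably the 70B diffbot-small-xl checkpoint can be served the same way by swapping the model flags inside `VLLM_OPTIONS`; the `--served-model-name` value below is an assumption, mirroring the small-model example:

```bash
# Assumed variant for the xl model (flag values mirror the diffbot-small example)
docker run --runtime nvidia --gpus all -p 8001:8001 --ipc=host \
  -e VLLM_OPTIONS="--model diffbot/diffbot-small-xl-1.0 --served-model-name diffbot-small-xl --enable-prefix-caching" \
  docker.io/diffbot/diffbot-llm-inference:latest
```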
## Extending Diffbot LLM Inference Server

To extend the Diffbot LLM Inference Server with new tools, please refer to [this tutorial](add_tool_to_diffbot_llm_inference.md).

## Diffbot-Hosted Service

To test the Diffbot LLM Inference Server before self-hosting, set the base_url to `https://llm.diffbot.com/rag/v1`:

```python
from openai import OpenAI

# get your free token at https://app.diffbot.com/get-started/
diffbot_token = "<YOUR_TOKEN>"
base_url = "https://llm.diffbot.com/rag/v1"
client = OpenAI(api_key=diffbot_token, base_url=base_url)
```

add_tool_to_diffbot_llm_inference.md

Lines changed: 111 additions & 0 deletions

@@ -0,0 +1,111 @@
# How to Add a Tool to Diffbot LLM Inference

We show how to add a tool to Diffbot LLM Inference by walking through the addition of `execute_js_v1`, a tool for JavaScript code execution.

### 0. Set up local development

To set up the virtual environment:

```
poetry env use python3.10
poetry shell
poetry install
```

To start vLLM, self-host one of the Diffbot LLM models with Docker (see [Self-Hosting](README.md)) and add `-p 8000:8000` to expose the vLLM endpoint. Then set the vLLM endpoint in `config.py`.

To start the server: `./start_server.sh`
### 1. Add the new tool to the system prompt (`system_prompt.txt`)

Below is the original system prompt, which defines the available tools in a TypeScript-style notation:

```
You are a helpful assistant with access to the following functions. Use them if required -
namespace Diffbot {
// Extract the content from the given URLs. Only call this endpoint if the user mentioned a URL.
type extract_v1 = (_: {
// URLs to extract, up to 5
page_url: string[],
}) => any;
// Query the Diffbot Knowledge Graph for an entity or set of entities that match a set of criteria using the Diffbot Query Language syntax.
type dql_v1 = (_: {
// Diffbot Query Language query
dql_query: string,
}) => any;
// Search the web for information that could help answer the user's question.
type web_search_v1 = (_: {
// List of Google advanced search strings (can include phrases, booleans, site:, before:, after:, filetype:, etc)
text: string[],
// Number of results to return (default 5)
num?: number,
// Page number of results to return (default 1)
page?: number,
}) => any;
} // namespace Diffbot
```

To add `execute_js_v1`, we can append the following definition as the last tool:

```
// Execute JavaScript expressions and get accurate results that could help answer the user's question.
type execute_js_v1 = (_: {
// JavaScript expressions to execute separated by newlines
expressions: string,
}) => any;
```

The final result is:

```
You are a helpful assistant with access to the following functions. Use them if required -
namespace Diffbot {
// Extract the content from the given URLs. Only call this endpoint if the user mentioned a URL.
type extract_v1 = (_: {
// URLs to extract, up to 5
page_url: string[],
}) => any;
// Query the Diffbot Knowledge Graph for an entity or set of entities that match a set of criteria using the Diffbot Query Language syntax.
type dql_v1 = (_: {
// Diffbot Query Language query
dql_query: string,
}) => any;
// Search the web for information that could help answer the user's question.
type web_search_v1 = (_: {
// List of Google advanced search strings (can include phrases, booleans, site:, before:, after:, filetype:, etc)
text: string[],
// Number of results to return (default 5)
num?: number,
// Page number of results to return (default 1)
page?: number,
}) => any;
// Execute JavaScript expressions and get accurate results that could help answer the user's question.
type execute_js_v1 = (_: {
// JavaScript expressions to execute separated by newlines
expressions: string,
}) => any;
} // namespace Diffbot
```
### 2. Implement the new tool

See `services/execute_js.py` for the implementation of `execute_js_v1`.

### 3. Call the tool in `llm/plugin.py`

The `invoke` method is responsible for calling the tools requested by the LLM. To call the new tool, we can add the following lines to this method:

```python
if function_name == "execute_js_v1":
    resp = await get_js_execution_service().execute_js(function_arguments["expressions"])
    return PluginResponse(
        plugin_url=function_name, method="INTERNAL", content=resp.json()
    )
```

where `get_js_execution_service().execute_js()` calls the implementation of the new tool.
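The dispatch pattern above can be sketched as a self-contained example. Note that `PluginResponse` and the service below are simplified stand-ins for the real classes in this repo, and the stub merely echoes its input rather than executing JavaScript:

```python
# Minimal sketch of an invoke-style tool dispatch with a stubbed JS service.
import asyncio
from dataclasses import dataclass


@dataclass
class PluginResponse:
    plugin_url: str
    method: str
    content: str


class StubJsExecutionService:
    async def execute_js(self, expressions: str) -> str:
        # A real implementation would sandbox and evaluate the JS expressions;
        # this stub just echoes them back as a JSON string.
        return f'{{"result": "evaluated: {expressions}"}}'


def get_js_execution_service() -> StubJsExecutionService:
    return StubJsExecutionService()


async def invoke(function_name: str, function_arguments: dict) -> PluginResponse:
    # Route the LLM's tool call to the matching service implementation.
    if function_name == "execute_js_v1":
        resp = await get_js_execution_service().execute_js(
            function_arguments["expressions"]
        )
        return PluginResponse(
            plugin_url=function_name, method="INTERNAL", content=resp
        )
    raise ValueError(f"unknown tool: {function_name}")


response = asyncio.run(invoke("execute_js_v1", {"expressions": "1 + 1"}))
```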
