Open LLM

This Open LLM Framework serves as a powerful and flexible tool for generating text embeddings and chat completions using state-of-the-art, open-source language models. Built on the Hugging Face Transformers library, it enables various natural language processing (NLP) tasks to be performed via simple HTTP endpoints that mirror OpenAI's API.

  • Provides an easy-to-use API interface to leverage powerful NLP models locally without needing deep expertise in machine learning.

  • (Planned) Integration with various applications, including chatbots, content creation tools, and recommendation systems.

  • Supports multiple models from the Transformers library, enabling diverse NLP tasks.

  • Utilizes GPU acceleration when available to enhance processing speed and efficiency.

  • Supports tunneling to expose the local endpoints for external access.

  • Reduces dependency on external APIs, potentially lowering operational costs.

  • Enables control over the computational resources used, optimizing for cost and performance.

Notebooks

For GraphRAG:

  • Open In Colab
  • More Coming Soon...

Usage

  • Prerequisites

    • Python >= 3.10
    • Docker >= 23.0.3
  • Source

# Clone the Repository
git clone https://github.com/rushizirpe/open-llm-server.git

# Install Dependencies
cd open-llm-server
pip install -e .

# Launch the server
llm-server start --host 127.0.0.1 --port 8888 --reload

Params:

  • start: Start the server
  • stop: Stop the server
  • status: Check the server status
  • --host: Specify the host IP (default: 127.0.0.1)
  • --port: Specify the port number (default: 8888)
  • --reload: Enable auto-reload for development
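
For example, to check whether the server is running and then shut it down:

# Check the server status
llm-server status

# Stop the server
llm-server stop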

API Endpoints

  1. Chat Completions: /v1/chat/completions
  2. Embeddings: /v1/embeddings
  3. System Metrics: /v1/metrics
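
The metrics endpoint is a plain GET and can be checked in the same way as the endpoints detailed below (shown against the default host and port; the bearer token is assumed to be required, as for the other endpoints):

# Query system metrics
curl http://localhost:8888/v1/metrics \
    -H "Authorization: Bearer DUMMY_KEY"
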
  • DockerHub

# Pull Docker Image
docker pull thisisrishi/open-llm-server
# Run Docker
docker run -it -p 8888:8888 thisisrishi/open-llm-server:latest

OR

# Run on Custom Port
docker run -e PORT=8000 -p 8000:8000 thisisrishi/open-llm-server:latest

OR

# Create and Start Container
docker compose up

Endpoints

  • Health Check

    • URL: /
    • Method: GET
    • Description: Check the status of the API and the availability of a GPU.
  • Usage

curl http://localhost:8888/
  • Response:
{
    "status": "System Status: Operational",
    "gpu": "Available",
    "gpu_details": {
        "GPU 0": {
            "compute_capability": "(8, 9)",
            "device_name": "NVIDIA L4"
        }
    }
}
  • Embeddings

    • URL: /v1/embeddings
    • Method: POST
    • Description: Generate embeddings for a list of input texts using a specified model.
  • Usage

curl http://localhost:8888/v1/embeddings \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer DUMMY_KEY"  \
    -d '{"input": "the quick brown fox", "model": "nomic-ai/nomic-embed-text-v1.5"}' 
  • Response:
{
    "object": "list",
    "data": [
        {"embedding": [0.56324344, 0.25775233, -0.123355], "index": 0},
        {"embedding": [0.30823462, -0.23636326, 0.543345], "index": 1}
    ],
    "model": "nomic-ai/nomic-embed-text-v1.5",
    "usage": {"total_tokens": 5}
}
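
Since the request and response shapes are plain JSON, the endpoint can also be called from Python. A minimal sketch using the requests library, assuming the server is running on the default host and port:

import requests

response = requests.post(
    "http://localhost:8888/v1/embeddings",
    headers={"Authorization": "Bearer DUMMY_KEY"},
    json={
        "input": "the quick brown fox",
        "model": "nomic-ai/nomic-embed-text-v1.5",
    },
)
# Each item in "data" carries an embedding vector and its input index
for item in response.json()["data"]:
    print(item["index"], item["embedding"][:3])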
  • Chat Completions

    • URL: /v1/chat/completions
    • Method: POST
    • Description: Generate chat completions based on conversation history using a specified model.
  • Request Body:

{
    "model": "openai-community/gpt2",
    "messages": [
        {"role": "user", "content": "Hi!"},
        {"role": "assistant", "content": "Hi there! How can I help you today?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7,
    "top_p": 1.0,
    "n": 1,
    "stop": null
}
  • Usage
curl http://localhost:8888/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer DUMMY_KEY" \
    -d '{"model": "openai-community/gpt2","messages": [{"role": "user", "content": "Hi!"}],"max_tokens": 150,"temperature": 0.7}'
  • Response:
{
    "choices": [
        {"index": 0, "text": "Hello, I can help you with a variety of tasks, such as ..."}
    ]
}
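
The same call from Python, again a minimal sketch with the requests library against the default host and port:

import requests

response = requests.post(
    "http://localhost:8888/v1/chat/completions",
    headers={"Authorization": "Bearer DUMMY_KEY"},
    json={
        "model": "openai-community/gpt2",
        "messages": [{"role": "user", "content": "Hi!"}],
        "max_tokens": 150,
        "temperature": 0.7,
    },
)
# Each choice carries an "index" and the generated "text"
print(response.json()["choices"][0]["text"])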

License

   This project is licensed under the MIT License. See the LICENSE file for more details.

Contributing

  Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

Contact

  For any inquiries or support, please contact me.
