- Mimic a lightweight OpenAI API server endpoint to serve the text-generation service.
- Use the llama.cpp library, created by ggerganov in pure C/C++, for text generation. It can be deployed on various platforms (embedded devices, cloud, mobile (Android, iPhone), ...).
- A simple UI tool to explore and experiment with the text-generation service.
This is a demonstration version; some issues and error checking are not fully validated.
Contact me via avble.harry dot gmail.com if you find any.
- A lightweight OpenAI API compatible server: av_connect HTTP server in C++
- Text generation: llama.cpp
- Web UI: a simple web interface to explore and experiment
Obtain the latest container image from Docker Hub:
docker image pull harryavble/av_llm
Run from Docker:
docker run -p 8080:8080 harryavble/av_llm:latest
Access the web interface at http://127.0.0.1:8080
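Because the server mimics the OpenAI API, a standard chat-completion request should work against it. Below is a minimal sketch of building such a request; the `/v1/chat/completions` route and the `"llama"` model name are assumptions based on the OpenAI API convention, not confirmed by this README:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request (constructed, not sent)."""
    payload = {
        "model": model,  # assumption: whatever model the server has loaded
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        base_url + "/v1/chat/completions",  # assumed OpenAI-compatible route
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("http://127.0.0.1:8080", "llama", "Hello!")
# Once the container is running, send it with: urllib.request.urlopen(req)
```

The request is only constructed here; sending it requires the server from the `docker run` step above to be listening on port 8080.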
- LLaMA 1
- LLaMA 2
- LLaMA 3
- Mistral-7B
- Mixtral MoE
- DBRX
- Falcon
- Chinese-LLaMA-Alpaca

This application is built on top of llama.cpp, so it should work with any model that llama.cpp supports.
Run with your own model by mounting a host folder into the container:
docker run -p 8080:8080 -v $your_host_model_folder:/work/model av_llm ./av_llm -m /work/model/$your_model_file
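A successful request to an OpenAI-compatible server returns a chat-completion JSON body. This sketch extracts the generated text, assuming the response follows the standard OpenAI chat-completion schema (a `choices` list whose entries carry a `message` object); the sample response string is hand-written for illustration:

```python
import json

def extract_reply(response_body: str) -> str:
    """Pull the generated text out of an OpenAI-style chat-completion response."""
    data = json.loads(response_body)
    # Assumed schema: {"choices": [{"message": {"role": ..., "content": ...}}]}
    return data["choices"][0]["message"]["content"]

# Hand-written example response in the assumed schema:
sample = '{"choices": [{"message": {"role": "assistant", "content": "Hi there"}}]}'
print(extract_reply(sample))  # -> Hi there
```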
T.B.D
It should work with the UIs below.
- Support more LLM models
- Support more OpenAI API endpoints
- Support more applications