Serve on vanilla sglang server with mid-layer #293

Open
SoheylM opened this issue Nov 20, 2024 · 1 comment
SoheylM commented Nov 20, 2024

Hi,

Is there a way to create a mid-layer to serve functionary on a vanilla sglang server?

The unit I work in runs an sglang server where I can easily serve any Hugging Face-hosted LLM. But I cannot install the functionary library on the server side, so I cannot run the server_sglang.py script there.

I attempted to create a middleware "server" on the client side without much success.

Thanks in advance for any tips.
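One possible direction (not an official functionary feature, just a sketch): do the function-calling-specific work entirely on the client side and send plain text-completion requests to the untouched sglang server. The prompt markers (`<|tools|>`, `<|user|>`, etc.), the `build_prompt` helper, the server URL, and the endpoint path below are all illustrative assumptions; the real template is defined by the functionary version and tokenizer actually deployed.

```python
# Hypothetical client-side "mid-layer" sketch: render a functionary-style
# prompt locally, then forward a plain completion request to a vanilla
# sglang server. All special tokens and the URL below are assumptions.
import json
import urllib.request

# Assumed address of the vanilla sglang server (OpenAI-compatible mode).
VANILLA_SGLANG_URL = "http://localhost:30000/v1/completions"


def build_prompt(messages, tools):
    """Render chat messages plus tool schemas into one text prompt.

    This format is illustrative only; functionary's actual chat template
    (which the real server_sglang.py applies) differs per model version.
    """
    parts = ["<|tools|>\n" + json.dumps(tools)]
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}")
    parts.append("<|assistant|>\n")  # leave the assistant turn open
    return "\n".join(parts)


def chat_completion(messages, tools, model="meetkai/functionary-medium-v3.1"):
    """Forward the rendered prompt to the vanilla server and return raw text.

    The caller is responsible for parsing any tool-call syntax out of the
    returned text, since the server knows nothing about function calling.
    """
    payload = json.dumps({
        "model": model,
        "prompt": build_prompt(messages, tools),
        "max_tokens": 512,
    }).encode()
    req = urllib.request.Request(
        VANILLA_SGLANG_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["text"]


if __name__ == "__main__":
    tools = [{"type": "function", "function": {
        "name": "get_weather",
        "parameters": {"type": "object",
                       "properties": {"city": {"type": "string"}}}}}]
    print(chat_completion(
        [{"role": "user", "content": "Weather in Paris?"}], tools))
```

The missing piece, and the hard part, is making `build_prompt` byte-for-byte identical to the template the model was trained with; any divergence tends to break tool-call formatting, which may be why ad-hoc middleware attempts fail.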

@mckbrchill
The same for vLLM would be nice.
I've found that server_vllm.py exposes only the /chat/completions endpoint.

Also, it seems that 3.1 medium generates a bunch of backslashes (\\\) when prompted with a long input sequence. Is that caused by the function-calling fine-tune or by the FP8 quantization? I haven't tried the non-quantized 3.1 medium yet.
